Page of Spammy Goodness

I’ve got a draft of my “it’s pretty much my own fault but I still blame Small Business Server” weekend fun post going, but haven’t had the time to finish it yet. Will get to that this week, but found something even more important that I had to note:

After receiving a few odd emails I started checking server logs and found that this page, which was created as a dramatic illustration for this post, is getting a surprising amount of traffic. Then I discovered that, according to Google, I own the phrase “spammy goodness!”

I’m so proud. Another head to put up on the wall with bayesian avocado and alien abductions.

Breaking News: Yahoo Entering “Search” Market

Catching up on yesterday’s reading I came across an Infoworld story on customer satisfaction levels for Web portals, search engines, and news and information sites in the US.

It’s a nice followup to yesterday’s post, as the final paragraph reads:

[CEO of ForeSee Results Larry] Freed acknowledged that the boundaries between search engines, portals and news and information are getting increasingly blurred, and that, accordingly, competition is also crossing frontiers. For example, Google has been adding features and services that put it in competition with portals, while Yahoo and MSN are likewise expanding into search engine territory, he said.

Wow. Yahoo is expanding into search engine territory, huh? Who could have predicted that something like that might happen?

It Used to Suck to be a Search Engine

The New York Times yesterday published an op-ed piece entitled More Is Not Necessarily Better, which raises some concerns regarding “the influence of [Web] search companies in determining what users worldwide can see and do online.” Read it, it’s interesting.

If you’re a regular reader of this blog, however, it should come as no surprise that I don’t entirely agree with the authors. I’m sure it’s intended to get people thinking about the degree to which we depend on search engines without realizing it, but it takes an approach that seems rather alarmist to me. Let’s start at the beginning:

## BEGIN NYT QUOTE
Imagine if one company controlled the card catalog of every library in the world. The influence it would have over what people see, read and discuss would be enormous. Now consider online search engines.
## END NYT QUOTE

While this is a slick metaphor and a killer lead for an op-ed piece, the implied comparison is inaccurate and deceptive. One can understand why they took this approach, though, as after some adjustments to bring it closer into line with reality, that lead just wouldn’t read as well:

Imagine if about half a dozen major companies, plus a bunch of smaller ones, all had competing versions of the card catalog of every library in the world, but one of those versions had become a lot more popular than any of the others.

Loses a little impact that way. “Doesn’t pop,” as one of my journalism professors was wont to say.

Let’s think back to the old days, say 1998 or so. It pretty much sucked to be a search engine. You didn’t get to lock your customers in with proprietary data formats, your service existed to send them to other people’s sites — not to keep them at your own site, viewing ads and generating revenue for you — and worst of all, you didn’t actually own or even control the data that made your business possible.

Any programmer with a good idea, decent skills, and a lot of hard drives could put together an offering out of their basement that could compete with you. So began the great migration from “search engine” to “portal” (or in some particularly marketing-influenced cases, “vortal”). Search became almost secondary as sites added email services, their own content, discussion groups…anything to get people to come to their site rather than someone else’s, because everybody knew that “search” by itself was something that anybody could do, and not enough to really attract users.

1998, of course, was also the year that the Google beta popped up on the Web. One input box, two buttons, and nothing else. Why would anyone possibly go there, when Yahoo offered search plus a whole bunch of other stuff? Because little by little people started discovering that Google — run by a couple of guys on a couple of machines — was better at searching the Web than Yahoo. People found the stuff that they were looking for more quickly and easily.

But back to the opinion. With a nice rhetorical shift (simile now, not metaphor), the authors bring up their concerns about Google’s methodology:

## BEGIN NYT QUOTE
Google’s use of links to find content essentially turns the Web into the world’s biggest popularity contest – and just as in high school, this can have negative consequences. Google’s great innovation in online searching, and the main reason it is so successful, is that its technology analyzes links among Web pages, not just the content within them. Behind Google’s complex ranking system is a simple idea: each link to a page should be considered a vote, and the pages with the most votes should be ranked first. This elegant approach uses the distributed intelligence of Web users to determine which content is most relevant.
## END NYT QUOTE

The authors’ concern is that “popular sites become ever more popular, while obscure sites recede even further into the ether.” There’s some validity to this concern, but its impact is lessened because search ranking isn’t actually much like a high school popularity contest.

The fact that the Alien Abductions Incorporated site holds the top Google spot for searches on “alien abductions” means that a link from AAI to a site about aliens would probably bounce that site up in Google’s rankings. A link from AAI to a site about gardening? Not so much effect on the target site: AAI’s “reputation” within one sphere of information doesn’t carry over to other spheres. Unlike high school, there isn’t one group of “cool kids,” but rather a virtually infinite number of topical cliques, each of which has reputation and influence within its own sphere.

Perhaps more important, though: the reason that this approach is widely used right now is that it works better than anything else that we’ve come up with. Rather than having an infinite number of monkeys that sit around trying to classify and rank all of the content on the Internet, PageRank and its associated programmatic variations and imitators allow the content to do most of the work itself.
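For anyone who wants to see the "each link is a vote" idea in motion, here is a minimal sketch of the textbook PageRank iteration. It is a toy under stated assumptions, not Google's actual ranking system: the page names, the damping value of 0.85, and the fixed iteration count are all illustrative, and this plain version knows nothing about topics, so it does not model the per-sphere reputation described above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to."""
    # Collect every page, including ones that are only ever linked *to*.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally ranked.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its "vote" evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, invented for illustration.
toy_web = {
    "aliens-inc": ["ufo-fans", "gardening-tips"],
    "ufo-fans": ["aliens-inc"],
    "sky-watchers": ["aliens-inc"],
    "gardening-tips": [],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

Nobody sits down to classify anything here. The ordering falls out of who links to whom, which is the "content does most of the work" point.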

But let us not forget that what Google did to Yahoo, Yahoo did to Alta Vista before them. That fundamental suck factor of search engines — the stuff that you’re searching is accessible to everybody and their mother — is still out there, and the cost of entry is still pretty low. All it takes is a startup with a platinum card to buy the hardware and a few really smart people who can deal with eating ramen three times a day for a couple of years.

That’s not to say that Google getting knocked off its perch is inevitable, but it is easily possible. Unlike customers of, say, Microsoft, Google’s users pay nothing to switch to someone else…they just point their browser to a new URL and Google’s ranking in the real world drops just a little bit.

Feed Splicing, Shell Scripts, and the Internet

Being an Assortment of Vaguely Related Thoughts

The handful of you who subscribe to the feed will notice that yesterday I started taking advantage of FeedBurner’s feed splicing capability. I set up an account with del.icio.us, dragged a couple of bookmarks into Firefox’s nav bar, and checked a box in my FeedBurner preferences…total time investment about two minutes.

The result is that now when I hit something interesting on the Web I just click my “post to del.icio.us” link, type in my notes about the page, and that link is automagically added to my Web-based del.icio.us bookmarks. Even better, FeedBurner then grabs that content every night and splices it into the seamonkeyrodeo RSS feed that you know and love. A cool idea, easy to set up and use, and exactly the sort of thing that companies like FeedBurner should be coming up with: my RSS feed just became more useful because I use FeedBurner.
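The splicing idea itself is simple enough to sketch in a few lines. What follows is a guess at the general shape, not FeedBurner's implementation: it assumes both feeds have already been parsed into plain dictionaries with `title`, `link`, and `published` keys, and the function name, sample items, URLs, and dates are all made up for illustration.

```python
from datetime import datetime


def splice_feeds(*feeds, limit=None):
    """Merge items from several feeds, newest first, skipping repeated links."""
    seen_links = set()
    merged = []
    for feed in feeds:
        for item in feed:
            if item["link"] in seen_links:
                continue  # the same link already arrived via an earlier feed
            seen_links.add(item["link"])
            merged.append(item)
    # Newest items go to the top, as a feed reader would expect.
    merged.sort(key=lambda item: item["published"], reverse=True)
    return merged if limit is None else merged[:limit]


# Invented sample data standing in for a blog feed and a bookmarks feed.
blog_posts = [
    {"title": "Blog post B", "link": "https://example.com/blog/b",
     "published": datetime(2005, 1, 12, 9, 0)},
    {"title": "Blog post A", "link": "https://example.com/blog/a",
     "published": datetime(2005, 1, 10, 9, 0)},
]
bookmarks = [
    {"title": "Bookmark X", "link": "https://example.org/x",
     "published": datetime(2005, 1, 11, 18, 30)},
    {"title": "Blog post A (bookmarked)", "link": "https://example.com/blog/a",
     "published": datetime(2005, 1, 11, 8, 0)},
]

for item in splice_feeds(blog_posts, bookmarks):
    print(item["published"].date(), item["title"])
```

Deduplicating on the link is one design choice among several. A real service might prefer to keep both entries, since the bookmark carries its own notes.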

Now on to the tangents…

I particularly like this because it fits in well with the way that I do things. One of the dumb little reasons that I stay with Linux boxes on the desktop at home and work is that I can keep a command line shell open all the time. Whether I’m at home or at work, I can just type “note” into that shell and my machine dumps whatever else I type into a file, adds a date/time stamp, and (once I’ve finished the note) shoots that file over to another server, where it is added to all the other notes that I’ve made over the last few years, all accessible to me via a Web page.
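A rough sketch of the "dump it into a file with a timestamp" half of that workflow might look like the following. This is not the actual `note` script, which isn't shown here. The function name, the file naming scheme, and the stamp format are assumptions, and the step that ships the file to another server is left out entirely (something like `scp` or `rsync` would be one way to do it).

```python
import tempfile
from datetime import datetime
from pathlib import Path


def save_note(text, notes_dir, now=None):
    """Write one note to its own timestamped file and return the file's path."""
    now = now or datetime.now()
    notes_dir = Path(notes_dir)
    notes_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: one file per note, named by time of writing.
    path = notes_dir / f"note-{now:%Y%m%d-%H%M%S}.txt"

    # Date/time stamp on the first line, the note itself underneath.
    path.write_text(f"[{now:%Y-%m-%d %H:%M:%S}]\n{text.strip()}\n", encoding="utf-8")
    return path


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as scratch:
    saved = save_note("Look into feed splicing again.", scratch)
    print(saved.name)
    print(saved.read_text(encoding="utf-8"))
```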

If you’re reading this on the Web, you may have noticed the “Notes” section in the sidebar…same idea: I note down random thoughts and a script just converts them to a .js file that gets pulled into this blog.
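The notes-to-.js conversion can be sketched the same way. The real script and its output format aren't shown in this post, so treat this as one plausible shape: the notes are assumed to be a list of small dictionaries, and the variable name `sidebarNotes` is invented for the example.

```python
import json


def notes_to_js(notes, var_name="sidebarNotes"):
    """Render a list of notes as a JavaScript file that defines one variable."""
    # JSON produced by json.dumps is also valid JavaScript, so the payload
    # can be dropped straight into a variable assignment.
    payload = json.dumps(notes, indent=2)
    return f"var {var_name} = {payload};\n"


# Invented sample notes.
sample_notes = [
    {"when": "2005-01-12 09:00", "text": "Feed splicing took two minutes."},
    {"when": "2005-01-12 21:15", "text": "Why is the \"standby\" box faster?"},
]

print(notes_to_js(sample_notes))
```

A page that pulls in the resulting file with a script tag can then loop over the variable and write the notes into the sidebar.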

This is all the sort of stuff that reminds me of why I think the Internet is cool…I can easily fling information around, slice and dice it, and present it or not to the outside world. There are ongoing discussions in various places about how weak Web browsers are as content creation tools; I suppose that’s because many people are concerned about the technological barrier to content creation on the Web being too high…in some ideal worlds, I believe, all content is on the Web and everybody both creates and consumes with a Web browser. Probably not a Microsoft Web browser.

What’s particularly great about stuff like FeedBurner’s feed splicing is that it addresses this issue of content creation in a very different way. By taking advantage of the basic interconnectedness of the Web to make it easy to combine information that already exists somewhere, things like the FeedBurner/del.icio.us combination get past the idea that everyone must be able to easily create HTML documents in order to “contribute” to the Web. I’m not against Wikis, nor the slick little Blogger interface that I’m using to make this post, but I am really glad to see people looking at…well, “networked content,” perhaps…in a different way.

Bugmenot.com down…but why?

Anyone who happens to use Bugmenot (or its excellent Firefox/Mozilla plugin) to skip over the multitude of registrations required to access content on many Web sites will notice that the site’s not there anymore. Usenet suggests that it’s been down since Tuesday, but I haven’t come across any reliable information on why.

There’s a machine sitting there that looks like somebody could be restoring from bare metal and backups following an ugly hardware mishap [yes, that’s why you don’t put your coffee cup down on top of the server], but it could also be due to the appearance of “legal pressure” of the sort that the operators have been concerned about.

Anyone got more information?

Update: According to a post on Mozillazine, bugmenot’s hosting provider pulled the plug, and they’re looking for a new home. I don’t expect that it’ll take too long for them to find one. Bets on how long before the site is back up and running?

Lemons and Lemonade. Pulpy Goodness.

Three days ago I was in the Garden of the Gods, just outside Colorado Springs, climbing around on incredible rock formations and enjoying the sun.

Today? Fate takes its cut. Four days before we planned on a scheduled “real world learning experience” changeover from our production database to our standby, we had to make the change to our standby. Like, now. And since we were rebuilding a nice, clean standby for the scheduled change, we had recently blown away the standby that was keeping up very nicely, about three minutes behind production. And the new standby was still 24 hours behind when production went all pear-shaped. And that was fucking excellent.

Not the sort of day that one would choose to have, were one given a choice about such things, but I’ve said it before and I’ll say it again: I work with some really good people. We’ve long since been up and running, and it’s starting to look like our tests hold true: the standby machine is doing an excellent job…possibly better than the much more expensive production box. Perhaps that’s something that we’ll discuss with the production machine’s vendor in the near future.

Anyway, the point is: we were given lemons, so we made lemonade. Actually, nobody gave us any sugar or water, so I think that what we did was pound the crap out of some lemons and then laugh maniacally as the juice splattered everywhere. Though maybe that was just me. Whatever.