Babystep documentation system almost ready

Wow, I’m up to the step where I show what the diff output looks like. OK, that presupposes that I turned the current and previous babystep code into text files. So, it’s time to fire up the FileSystemObject and the TextStream object again. I made heavy use of them in the first half of the project, but mostly for reading. This time, I’ll be opening files for writing, and then they will be immediately read back in. Once read in, there will be an object in memory that represents the new content for between the babystep tags in the current post. And as we’ve done recently, we will use the RegExp object to replace the babystep tag pattern match with the new content. The resulting content gets updated back into the table, and voila! The application will be done.
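
Here’s a minimal sketch of that cycle, with placeholder paths and a hypothetical <babystep> tag pattern standing in for the real ones:

    ' Minimal sketch of the write-out, read-back, replace cycle (run with cscript).
    ' The folder, file name, and <babystep> tags are placeholders.
    Dim fso, ts, re, strCode, strPost, strNew
    strCode = "Response.Write ""Hello, babystep!"""
    strPost = "Intro text <babystep>old code</babystep> closing text"

    Set fso = CreateObject("Scripting.FileSystemObject")

    ' Write the current step's code out to a text file...
    Set ts = fso.CreateTextFile("C:\babysteps\current.txt", True)
    ts.Write strCode
    ts.Close

    ' ...then immediately read it back in.
    Set ts = fso.OpenTextFile("C:\babysteps\current.txt", 1) ' 1 = ForReading
    strNew = ts.ReadAll
    ts.Close

    ' Swap the new content in between the babystep tags.
    Set re = New RegExp
    re.Pattern = "<babystep>[\s\S]*?</babystep>" ' lazy match, spans newlines
    re.Global = True
    strPost = re.Replace(strPost, "<babystep>" & strNew & "</babystep>")
    WScript.Echo strPost ' updated content, ready to UPDATE back into the table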

Right now, both halves of the project have lots of output. It’s all really just debugging output. When I combine the two halves of the project, the output will actually be made invisible. Instead, it will reload the same discussion forum thread you are currently looking at, but will force a refresh. The process will not be automatic at first, so that I can retroactively apply it to discussions that already exist. Think clicking a “babystep” link over and over until the program is fine-tuned. It will be safe to re-run, so if there are still adjustments to be made, no real damage is done. I think I’ll make a backup of the table beforehand just to be on the safe side. And this same system can be used to add a program code colorizer and beautifier if that ever becomes a priority.
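
The backup itself can be a one-liner against SQL Server; a sketch with made-up table and catalog names (SELECT INTO fails if the backup table already exists, which is fine for a one-time safety copy):

    ' Safety copy before re-running (names are made up for illustration).
    Dim cn
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=SQLOLEDB;Data Source=(local);" & _
            "Initial Catalog=CMS;Integrated Security=SSPI"
    cn.Execute "SELECT * INTO posts_backup FROM posts"
    cn.Close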

It will be absolutely fascinating to watch how this affects my work. It is central to the way I work, and to how I plan to maintain my focus and stay engaged. The best way to learn is to teach, and this is the best way for me to teach. It forces me to be systematic, and allows me to review my process. It creates a play-by-play archive of a project, recording my thoughts at different stages. It will help other people when they work on similar projects, and will help me by allowing review and feedback from my peers. I’m sure professional programmers will cringe at most of my VBScripting work, and at my liberal use of global variables to avoid parameter passing. But these initial projects are not about purity. Neither are they about long-term manageability. They are about breathing life into a site quickly and starting to build an excitement level that will justify switching to my ideal platform, at which time I will go through some very interesting learning processes and document it all here in the babystep style.

Process is an important characteristic of a project that rarely gets proper play. Programmers don’t like to reveal their follies, and the book publishing model taught us to be efficient with our examples. Rarely would you re-print an entire program example to show how just a few lines of code changed from one page to the next. But that’s exactly where the value lies. I can’t count the number of times I looked for code examples on the Web and had difficulty viewing the code out of context. Seeing it built up from scratch, especially when you go in steps of just a few lines at a time, can make programmers out of even the slowest learners. There is a reason for every line you put into a program, and those reasons get lost because the process flow gets lost. After a while, it just becomes the finished product and you lose the sense of how you got there.

Wow, this post about the thought process behind the babystep tutorial system was going to go in the internal system, but it provides such insight into the HitTail site, and the type of content that’s going to be found here, that I think I’ll add it to the HitTail blog. I am also thinking about actually putting out a tutorial on the birth of the babystep tutorial system itself. I like the way that it is so self-referential. I will use the babystep documentation system to show the evolution of the babystep documentation system. It’s all very circular. Some of the best systems are circular and self-referential in this way.

Finding Your Longtail of Search

Search engine optimization, as most of us know, is too complicated and mysterious to ever become mainstream. Yet it must, because of the disproportionate advantage it gives to those who get it right. In advertising, you might spend millions on a Super Bowl commercial. In PR, you might get mentioned in the NYT or WSJ. But in SEO, you get that top result on your keyword day-in and day-out, every time anyone in the world searches on that term. And that is too important to ignore.

Pervasiveness within the natural search results makes or breaks businesses. When the rules change and positions are lost, you can often hear cries of foul play. The wounded can launch into conspiracy theories about being forced into AdWords participation. John Battelle picked one of the many examples of such people for his book, The Search. Paid search, pioneered by GoTo.com, ultimately succeeded despite initial resistance, because it has a very clear value proposition that the media buyers who control marketing budgets could understand. I pay x-amount. I get y-listings. It’s just like advertising. Not so with natural search!

The rules of natural search optimization are always in flux, and there’s something of an arms race between spammers and the engines. Engines will never fully disclose how to position well, or else spammers will be able to shut out all the genuinely worthy sites. So, the trick for the engines is to always reward genuinely worthy sites, and the most important objective for any SEO is therefore to make their sites genuinely worthy.

This concept of genuine worthiness is likely to stay around for a long time, because of how readily trust in a search provider can be broken, and how easy it is to switch. Think how little actual investment or commitment you’ve made to a search site. It’s not like you paid anything. As a result, search providers are uniquely vulnerable to the next big thing, which can come along at any time, prompting legions of users to flock away to the latest golden-boy darling site. It happened with AltaVista and Lycos, and could easily happen today, even to the 800-lb. gorillas-of-search. Yes, I firmly believe that the concepts of trust and the rewarding of genuinely worthy sites independent of advertising are here to stay. So, any company looking for that extra edge is obliged to look at their options in natural search. Enter HitTail.

So, who determines whether a site is worthy? What actions can you take to ensure that your site is worthy by today’s criteria and the unknowable criteria of tomorrow? Craig Silverstein, one of the Google engineers who makes the rounds at the search engine conferences, once stated that Google’s main objective in search results is not, in fact, relevancy. It’s making the user happy. Happiness is the main goal of Google. And a lot of effort is going in this direction by integrating specialized searches, such as news, shopping, local directories, and the like, into the default search. There is also personalized search, which makes the results differ based on your geographic location and search history. So, things are changing rapidly, and there are many factors to consider when you ask what makes a site worthy. When everything is mixed together and regurgitated as search results, what is the single most important criterion affecting results that is unlikely to change over time? That is where HitTail is going to focus.

Exactly what is this most important criterion? Quality is subjective. Anything can be manipulated. Old-school criteria from when AltaVista and Inktomi were king relied mostly on easily manipulated on-page factors, such as meta tags and keyword density. Google’s big contribution is PageRank, which looks at the Internet’s interlinking topology as a whole. It’s a model based on the academic citation system used in publishing papers. The result was a broadening of the manipulation arena from single pages to vast networks of inter-related sites, wholly intended to change that topology to indicate things that weren’t true. Today, the engines sprinkle in many criteria, including fairly sophisticated measures of which sites were visited as a result of a search, and how much time was spent there. The engines also subtly change how the various criteria are weighted over time, which keeps all the manipulators scratching their heads, wondering what happened, and spending months responding.

This way lies ruin. At what point does the effort of manipulating search results become more expensive than just buying keywords? For most companies, it’s a no-brainer. The only people trusted less than the search engines are the snake-oil salesmen claiming to be able to manipulate those results. Why risk getting a site banned? Why invest money in something that may never pay off? I could not agree more. SEO as it is known today is too shadowy and adversarial to ever become a mainstream service, and therefore a mainstream market.

So, are you going to let your competitor cruise along getting that top natural search result, while you’re relegated to pay and pay—and even engage in a competitive bidding frenzy just to hold your position? Of course not! And therein lies the rub. It’s a Catch-22. There’s no way out. Pay for keywords, or enter that shadowy realm.

How do you get your natural hits today and have insurance for the future, no matter how things change? The answer is in the latest buzzword that’s coming your way. You’ve probably heard it already, and if you haven’t, get ready for the tsunami of hype surrounding long tail keywords. The term “long tail” was apparently coined by a Wired writer, and has since been adopted by the pay-per-click crowd championing how there are still plenty of cheap keywords out there that can pay off big. The long tail concept, as applied to paid search, basically states that the most popular keywords (music, books, sex, etc.) are also the most expensive. They’ve got the most traffic, but also the most competition. But when you get off the beaten track, keyword prices drop off dramatically, and the list of available keywords in the “long tail” of that slope never runs out. That’s right—as keywords get more obscure, they get cheaper, and although the overall traffic on those keywords goes down, the value of the customer may even go up!

So, the long tail of search has a very clear value proposition as applied to paid search, which today is principally Google AdWords and Yahoo Search Marketing. What you do is ferret out those obscure keywords (through WordTracker, your log files and analytics, and brainstorming), run cheaper campaigns, pay for fewer clicks, and win bigger when they convert. The problem with doing this in the paid search arena is that the work of identifying these keywords and migrating them into a campaign is so complex. Traditional media buyers and the average person working in a company’s marketing department couldn’t handle it, so the work has been outsourced to search engine marketing (SEM) firms, making yet another new industry.

But Google automates everything! Can you imagine tedious human busywork standing in the way of increased Google profits? So, why not just automate the process and let everyone automatically flow new keywords into an ad campaign and automatically optimize the campaign based on conversion data? Just write an app that figures out the obscure keywords in your market space and shuttles them over to your AdWords campaign! Then, drop and add keywords based on how well they’re converting. Before long, you have the perfectly optimized paid keyword campaign, custom tailored for you. You can even do this today using the Google and Yahoo APIs and third-party products. But it is in the engines’ greatest interest to make this an easy and free process. This, I believe, is why Google bought Urchin and made its analytics service free. Watch for some big changes along these lines, and for the still-new industry of SEM to have its world rocked.

And so the stage is set for HitTail. Paid search is being fine-tuned into a money-press, but natural search is too important to walk away from. Yet, constant change prevents products that improve natural search from becoming mainstream. Therefore, the best deal in marketing today—pay nothing and enjoy a continual stream of qualified traffic—is unattainable to marketing departments in companies around the world. They are shut out of the game, because when researching it, they get conflicting information, encounter a shadowy world, and get constantly corralled back to the clear value proposition of paid search. This has created a potential market whose vacuum is so palpable that it’s always right at the edge of consciousness. It is a very sore pain-point that needs relief. It causes anxiety in marketing people whenever they search on their keywords and inspect the resulting screens.

Yes, HitTail proposes to relieve that anxiety. The way it does this will be so above-the-table and distant from that shadowy world of SEO that I believe when the Google engineers inspect it, they will give a smiling nod of approval. For HitTail will be automating very little, and it will be misleading even less. It will, quite simply, put a constant flow of recommendations in your hands. If your product or service is worthy of the attention you’re trying to win, from the market you’re trying to serve, then we will help you release the latent potential that already resides in your site.

HitTail is a long tail keyword tool that will help you tap into the almost inexhaustible and free inventory of relevant keywords that fills the long tail of search, so that you can get your keywords for nothing, and your hits for free.

Getting my day started

So, it’s 11:00AM, and I’m really only just getting started for the day. That’s fine, because I went until 1:00AM last night, and made such good progress yesterday. Also, today is Friday, meaning I can go as late and long as I want without worrying about tomorrow. This can be a VERY productive day. I lost an hour and a half this morning trying to update my Symbian UIQ Sony Ericsson P910a phone with the latest iAnywhere Pylon client sync software. I convinced our IT guy to upgrade so I could get the newly added features of syncing to-do items and notes—something I got very fond of with my old Samsung i700 PocketPC/Smartphone.

The iAnywhere instructions say that I need to uninstall the old Pylon application at the very minimum, and better yet, do a hard reset. Only two problems: the Pylon software doesn’t show in the uninstall program, and the process for hard resetting a P910a is some sort of secret. You can find instructions on the Internet that involve removing the SIM card and doing a series of button presses, but they don’t seem to work. Anyway, I did a restore to undo any damage I did this morning, and decided to get back to MyLongTail.

I’m sitting in Starbucks right now. I can’t find a free and unsecured WiFi connection, so I’m working offline. I am considering one month of T-Mobile hotspot access, and I see that they offer a special deal for T-Mobile voice customers. But I don’t want to put my credit card information in on a public WiFi network, so I’ll do my thought-work here and return home when I finish my coffee or the battery starts to drain, whichever comes first.

The marker-upper program that I wrote is just awesome. I think I’ll be able to crank out babystep tutorials in a fashion and at a rate that is unrivalled on the Internet. Indeed, MyLongTail may start to become known as a programming tutorial site. But I’ll have to maintain a separate identity for that area of the site, because I don’t want to scare away people who are just there for the main feature—a clear and easy-to-implement natural search optimization strategy. It’s more than a strategy. It’s a play-by-play set of instructions.

Long day

Well, it’s just about 1:00AM, and I spent the majority of today on the marker-upper project, which is just fine, because I’m thrilled with the progress I’ve made. All the logic is done, and now it’s just a matter of integrating it with my home-grown CMS system. The beautiful part is that it’s 1:00AM and I started early this morning. So, the project is taking on momentum. It is easily as interesting as anything else I could be working on, which is key to managing distractions. As long as the main work is more interesting than any of the distractions, the distractions have no power.

It’s obvious that blogging could become a distraction, but that’s fine as long as it keeps me on-course. The marker-upper project for superior baby-step tutorials will actually help me work my way through the MyLongTail project. Some pieces will become public and be posted here, but others will not. Yet, I still plan on using this tutorial method of building up the apps. My SEO team at Connors is the audience for those tutorials, but I will endeavor to make as many parts of them public as possible.

Yes, I didn’t get to the other two projects that I had hoped to, but this is only the first day of what I hope will be a 4-day focus-spree. I will have to do some other work over the weekend, but for the most part, I am going to try to make MyLongTail into something that can start getting some excitement going. At the very minimum, I need to start collecting contact info from people who would like to start testing it.

I’ll be looking for early adopters. I think of this much like GoTo in the early days. A lot of people didn’t get it, but GoTo laid the framework for paid search, which analysts are saying is now over a 5-billion dollar market. I don’t think MyLongTail will be as gangbusters as paid search, because it is to paid search what public relations is to marketing. I’ll be posting much more on those topics, drawing the parallels between the “unpaid” promotion aspects of both PR and SEO. You might even call what MyLongTail intends to accomplish PR via Search. Or perhaps “search relations”.

Web 2.0 and Lifestyle 2.0 in NYC

I’ve lived in NYC for over a year now in this new job at the PR firm Connors Communications, but I have hardly gotten out to see the city. It’s my own fault, but now that this MyLongTail project is taking center stage, it threatens to swallow me up, and yet I want to get out and become a real New Yorker more than ever. So, I decided on a creative solution.

I’m not much for the bar and nightlife scene, but I am a fiendish coffee drinker. This topic for a blog post is just a silly tangent, but I want to create the post to add some color and commit myself to this project. I have a plan. It addresses getting rid of distractions that threaten the MyLongTail project, and it forces me to get out and see NYC a little more. I’m one of the schmoes paying over $200/mo. for cable TV, plus premium channels, plus high-speed Internet, plus PVR. I have a laptop, but can’t reliably get onto the Internet when I’m walking around, because so many of the strong WiFi hubs are pay services. And I live on West 16th Street, not far from Avenue of the Americas, so I’m probably pretty close to a pay-service hotspot. I don’t really watch that much TV, and prefer buying the DVDs anyway, or using BitTorrent to pull down the latest recordings, which is the only thing I need the high-speed Internet connection for.

My first inclination was to replace the $200/mo. charge with $15/mo. Verizon DSL, which seems to be the big deal right now. This would be contingent on being able to get DSL without phone service. From my Googling, it appears Verizon is being forced to offer that. I’ve gone wireless for phone using T-Mobile, because it gets me a decent voice plan and unlimited downloads for $60/mo. So, I’m already paying a pretty penny for phone and data. I see no reason to pay an additional $200/mo. for Internet. While the Verizon choice would be economical, it wouldn’t turn me into the wireless warrior that I want to be, able to roam from Starbucks to Starbucks during the course of the day, taking in NYC.

A little background on why that’s important. I’ve now had over a year of a truly integrated lifestyle, living 6 blocks from work in the Chelsea section of NYC, and I gained back almost 2 hours per day by getting rid of the commute. But I find that I lost something of an inspired edge that I used to have. I isolated it down to the nearly hour-long car commute, where ideas were flying around in my head, processing at what must have been a subconscious level. When I sat down to do a project, it was almost like I had already discussed it in-depth (similar to this blogging). This applied both when I got to work and when I got home at the end of the day (yes, I am a workaholic). But now, with my new integrated lifestyle, going directly to the office environment with the distractions of the daily grind and directly home to the distractions of TV and cats, I lost that edge. I need to get it back pronto. And I might as well start taking in a little more of NYC in the process. We’re in the middle of winter right now, but it’s been unseasonably mild. Such walks will be invigorating and healthy, and provide good stopping points in which I can subconsciously process ideas, while motivated by the goal of feeding my caffeine addiction, which I will be better able to afford (even at Starbucks), having given up $200/mo. TV.

Unlimited national T-Mobile hotspot service with a 12-month commitment is about $30/mo. That’s way better than $200/mo. for Time Warner cable plus RoadRunner, but it’s contingent on me being able to pull in that signal from home. I may even take it on blind faith that I will be able to, and either buy a fancy WiFi antenna or follow one of the many instructions on how to build one from chunky soup cans or Big Boy tennis ball cans. Hey, I’m in Manhattan, and if anyone can pull in a T-Mobile hotspot from home, you can do it here.

This blog post should also illustrate some points about corporate blogging strategies and SEO. Both quantity and quality of posts count when it comes to SEO. Typically, you want to keep your posts on-topic to your site. But the occasional divergence, including humanizing the blog, spices up the distribution of keywords that your site is targeting. While I’m not trying to attract hits from people looking for T-Mobile or Starbucks, these are both popular mainstream topics, which, when mixed with all the other words mentioned on this page, help to kick-start the MyLongTail formula, which you will be learning much about in short order.

This post is also a good example of Web 2.0 thinking: how services can be mixed and matched to suit your customized needs (be they XML Web services, or phone or Internet service). It has little to do with the service provider’s intended use. We’re free to mash it up as we like in order to pursue increasingly individualized lifestyle choices or program applications. Yes, as Paul Graham points out, Web 2.0 is a contrived buzzword to justify a new conference. But it wouldn’t have become so broadly adopted if it didn’t strike a fundamental chord. Like Reagan telling Gorbachev to tear down this wall, O’Reilly and Battelle are telling developers to tear down the walls around walled gardens of service.

I actually went ahead and posted that idea for a theme on John Battelle’s search blog, but I didn’t link it back to this post, because I’m not quite ready to be found by the spiders yet. Though even providing this link out could start the process, because John might be running the Google Toolbar with privacy turned off, and might look at his referrers. He also might have his log files or analytics reports findable by Google, leaving a path here. Anyway, that just pushes me on with all the more urgency to my projects.

The 80/20 Rule and Nested Sub-projects

OK, I’m starting on these 2 projects, but I’ve got the documentation bug. These are two very mainstream projects that would be of great use to the world at large. And I’m going to build them up from scratch, using nothing more than what’s already installed on most people’s desktop PCs. So, I want to document it with my baby-step technique, where I show the entire program as it develops on every page of the tutorial. It’s quite inefficient from a file standpoint. But if the CMS system makes it manageable, there’s really no harm. It is the Web, after all, and you don’t have to kill trees for more pages. And if it makes the student’s experience better, it’s worth it.

But that leads to a nested sub-project, and the issue of whether or not to do it. I am a big believer in the 80/20 rule, which states you should plan to get 80% of the benefit of any endeavor from the first 20% of the work (there are other interpretations for tax purposes, etc.). So, a series of half-implemented projects has a net gain and still moves you forward. Nested sub-projects, which plague many professions, are the enemy of productivity and the 80/20 rule. Suddenly, you’re embarking on project after project before you even start on the first thing. I even wrote an 80/20 rule poem.

You saw an example of me avoiding a nested sub-project pitfall when I decided to just go ahead with Blogger. I could have tried installing WordPress, and had another server and database to deal with, a new system to learn, etc. I could have even written my own (which I have partially working). But by choosing Blogger, I could just forge ahead. And here I am the next day with many blog posts under my belt, standing at the edge of another potential pitfall, looking over the edge. Let me explain.

I can already do baby-step tutorials using my CMS. The problem is that to make them really cool, you have to highlight which lines changed in the code from page to page. Manually inserting the markup is tedious, and defeats the purpose. It can and should be automated, and I have already used a Windows derivative of the Unix diff program to get started. It’s halfway implemented. I basically make a post using my home-grown blogging system, and it looks at the previous post, finds the differences, and automatically inserts highlighting and strike-out code to show on the current post what changed from the previous post. It’s way-cool, and the foundation for something new and innovative on the Internet in its own right. I could easily imagine this site becoming as popular for the novel baby-step tutorials as it is for the MyLongTail app. Problem is, it’s only half-implemented, and I don’t know whether I should try to bang it out all the way before today’s main projects.
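
The gist of the half-implemented part looks something like this rough sketch, assuming a Windows port of diff (diff.exe) is on the PATH and using placeholder file names; it handles only added and removed lines, and skips HTML-encoding for brevity:

    ' Rough sketch: run diff against the previous and current steps, then
    ' wrap added lines in a highlight span and removed lines in strike-out.
    Dim sh, ex, strLine, strMarkedUp
    Set sh = CreateObject("WScript.Shell")
    Set ex = sh.Exec("diff.exe C:\babysteps\previous.txt C:\babysteps\current.txt")

    Do While Not ex.StdOut.AtEndOfStream
        strLine = ex.StdOut.ReadLine
        Select Case Left(strLine, 1)
            Case ">" ' added in the current step
                strMarkedUp = strMarkedUp & "<span class=""added"">" & _
                              Mid(strLine, 3) & "</span>" & vbCrLf
            Case "<" ' removed since the previous step
                strMarkedUp = strMarkedUp & "<del>" & Mid(strLine, 3) & _
                              "</del>" & vbCrLf
        End Select
    Loop
    WScript.Echo strMarkedUp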

Let’s evaluate. Ask the key questions…

  1. Can you possibly imagine even more nested sub-projects? Or is it a clear one-off? Will you get caught in a recursive trap and a nightmare of maintenance overhead?
  2. Is it foundational, meaning that it will improve all the other work you do and start resulting in compounding returns? So, over time, do you really save time?
  3. Is there a better already existing way to do this?
  4. What are the corollary benefits, and do they outweigh the lost time?
  5. Are there urgent aspects of the projects you’re putting aside? What is the damage of delay?
  6. Is it really necessary for your primary objectives?

This should be a clear one-off project, because it is basically a rebound action on a database insert. When the insert occurs, the quick processing runs and the auto-markup happens.

It is foundational, but only on the documentation side, if you consider documentation foundational. It’s way different from other documentation systems in that it captures process, for better or for worse. It is much like the Disney work-in-progress pencil-sketch test animations that have more character than the finished product. They also give you more ability to learn about the animation process than the finished product does. This kind of documentation is rare to the point of non-existence in the programming world, because it takes too much time to produce, and it reveals the many imperfections of the creative process (because it documents mistakes and all). The closest things to it are Wiki revision tracking and code version management software. Anyway, all this is to say, yes, I believe the work is foundational, because it is key to providing a rich documentation and tutorial experience on this site.

This gets to the fact that I already made the decision to use my home-grown CMS system. With that decision made, I need to choose something that integrates well. And I actually am already choosing “the better way” to do this by tying in the Unix-derivative diff program. I could have attempted to program the difference detection from scratch, but this program gives me everything I need to parse a file and insert the code. I can focus on parsing and marking up instead of detecting differences.

There are massive corollary benefits. It allows me clearer differentiation of what I document in Blogger and what goes in the CMS (stream-of-consciousness goes in Blogger, and baby-step code tutorials go in the CMS). The tutorials increase the coolness factor and buzzworthiness, improving the chances of getting this site written about, and eventually SlashDotted. That is not only a corollary benefit, but a main objective. The project also clarifies my own thinking, making me code more carefully, knowing that the non-proprietary parts are going to be public and under scrutiny by other programmers.

Yes, there is a very urgent aspect to the projects I’m putting aside. I want to document the very first search hits ever to occur on MyLongTail, and the very first GoogleBot visit. I may miss them. The site is already out there, and I’ve been blogging (but without pinging). Will the site be that much less interesting if I miss these key events? If I can really isolate the projects (all 3) down to a single day, will I really be jeopardizing it that much more? I don’t think so. So, all three projects should be done in one day. But one of those projects is really less important than the others. More on that soon.

While not necessary to my primary objectives, it certainly does help the “erupt in buzz” objective. More tutorials mean more pages of real value to a broader audience, more search-optimized pages, and more pages using a unique and valuable tutorial technique that perhaps the buzz-brokers will recognize. It’s also worth noting that I already have the blogging bug, which is somewhat cutting into just getting the work done (it’s 11:22AM already). And having this system running will let me feed that blogging appetite while simultaneously actually doing the coding. So, while not necessary for my primary objectives, it highly reinforces them, and I will move ahead with the baby-step tutorial marker-upper.

VBScript in a Web 2.0 World

Well, what about those 2 projects? To help this site erupt in buzz, I’m going to also make it into a tutorial site, focused on a unique brand of tutorials that I can’t get enough of: baby-step tutorials. Now for a bit of philosophy. I program in chisel-strike projects—projects I can conceive of and implement in a single day, each staying consistent with the overall vision. I was inspired by the way Michelangelo once described his work as revealing the sculpture that was already hidden in the stone. Every chisel-strike a master sculptor takes reveals more of the masterpiece contained within. It reaches a certain point where it’s clear what the sculpture is, and it becomes a joy to look at, and could already be put on display.

That’s what all these “beta” sites that stay in beta for years are about (in addition to reducing tech support liability). There’s no reason to wait for the pristine and polished finished product before you start getting the benefit. There are lots of ways to describe this. I use a chisel-strike metaphor. In programming, there used to be a lot of talk of spiral methodologies replacing waterfalls. Recently, talk of agile methodology has come into vogue. Some would call it hacking. But whereas hacking in yesteryear resulted in a working app at the expense of long-term manageability, hacking today can very easily result in the same working app, but on top of a robust framework that “un-hacks” it. Ruby on Rails is an example of such a framework.

But many of the chisel-strike projects I start out with are going to be VBScript. That’s right. I’m building this thing in ASP Classic on IIS. I’m doing it knowing it will move to the Ruby programming language for the back-end scripts when I have the time, and to the Ruby on Rails agile framework for the front-end user interface and applications. What? ROR is supposed to be so ridiculously simple that you can sit down, install it, and have written your first app in under an hour. It’s true, and I’ve done it. But several factors affected my decision to move forward with VBScript.

First and foremost, I too am doing an extraction of an existing system (the way ROR was extracted from Basecamp). I don’t like my extraction as much, and I’ll never open source it. But it exists, and it’s my fastest path to implementation. Second, once I make the move to ROR, I think it will be time to break all my Microsoft dependencies and get off of SQL Server. I love SQL Server, and think it’s tweaked-out in terms of transactions per second, self-optimization, and disaster prevention in a way that MySQL is not (yet). It is increasingly an acknowledged competitor to Oracle and DB2. But scalability has a lot to do with cranking out multiple software instances of your system at little to no additional cost. That means being in the open source world.

It will also lower operational cost and maximize profits. Yes, there are MS-arguments against this, but they don’t hold up over time as there are more and better means of supporting open source installations. And I don’t know Linux/Apache yet. So, no matter how simple ROR may be, I will be taking it as an all-or-nothing package. I don’t want to create a hybrid by keeping a Microsoft platform but installing MySQL and Ruby. Even though it would be a great learning experience, it would slow my initial speed of deployment. The benefit for you as an audience is to see someone still doing viable Web 2.0 work on VBScript/ASP Classic, with a plan to move to Ruby on Rails, and then to see whatever tutorials I create during the transition. If my plan goes well, it should be a series of baby-step tutorials that will help anyone make the move.

Blogging, Continuity and Productivity

One way a blog like this helps when designing a new Web 2.0 site is continuity of discussion. I’m working on this project primarily as a one-person show. I have some great backup in the programmers we have working for us at Connors back at the office, and I have my long-time partner in crime who helped with previous incarnations of the system. But this blog constitutes a real-time, ongoing discussion with myself, and lets me pick up where I left off smoothly.

There was an article I read a few months ago about productivity in programmers. I forget exactly where, but I think it was when I was researching agile methodologies, and the author made the point that a single programmer with a clear vision of what he/she is trying to do can be something like 1000% more effective than a programmer working on a team. That is, one motivated programmer using agile development methodologies can do 10x more work than a counterpart working as part of a team where project management software, bureaucracy and meetings constantly erode the ratio of hours spent to work accomplished. It wasn’t Paul Graham who wrote this, but somehow I associate the concept with him, based on how it jibed with the many articles I’ve read on his site. If I find the actual reference, I’ll post the link.

The purpose of the blog posts in the morning is like winding the catapult. I should have clarity on the rest of the day. Yesterday, I made the HitTail site live. I essentially made the decision to develop this live online in stealth mode. This has the SEO advantage of letting the clock start ticking as soon as possible to let the domain age as far as the engines are concerned. The latest Google wisdom following the Jagger update is that a domain should be about a year old to overcome a negative weighting penalty. Most spam sites are newly registered domains. There’s some uncertainty about when the clock starts ticking—whether it’s when the domain is registered or when Google discovers it for the first time.

GoogleBot is unlikely to discover the site until at least one inbound link is established to it. But several PCs I use have the Google Toolbar with privacy turned off, so Google will know about the existence of these pages very soon (if not already). But I want this site to chronicle a complete and accurate history of the birth of a Web 2.0 site from an SEO point of view. So, today’s priority is to put the systems in place to track spider visits.
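
Tracking spider visits in ASP Classic can be as simple as an include at the top of each page; a minimal sketch, where the SpiderVisits table and Application("ConnectionString") are invented for illustration:

    <%
    ' Sketch of a spider-tracking include; table and connection names are
    ' placeholders, not the real ones from my system.
    Dim strAgent, cn
    strAgent = Request.ServerVariables("HTTP_USER_AGENT")
    If InStr(1, strAgent, "Googlebot", vbTextCompare) > 0 Then
        Set cn = CreateObject("ADODB.Connection")
        cn.Open Application("ConnectionString")
        cn.Execute "INSERT INTO SpiderVisits (VisitDate, UserAgent, Url) " & _
            "VALUES (GETDATE(), '" & Replace(strAgent, "'", "''") & "', '" & _
            Request.ServerVariables("URL") & "')"
        cn.Close
    End If
    %>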

These spider-monitoring systems also start the more advanced process that this site is all about—collecting data that becomes intelligence that becomes action. HitTail is not going to advocate spider-watching, because that is a misappropriation of valuable time from the average marketing department’s point of view. I’m doing it because it’s of interest for this particular site. When was the first visit by GoogleBot? Which pages has it picked up? How much time went by before the first Google search hit occurred? Yes, this might be of casual interest to marketing departments that have too much time on their hands. But HitTail focuses on “what hits occurred recently” and “how can we use that to make more hits occur soon?” Much of the peripheral and pedantic detail will be thrown in the trash to make the overall system more focused and efficient.
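
And for the “what hits occurred recently” side, the raw material is the referrer. A hedged sketch of pulling the search phrase out of a Google referrer (Google used the q= parameter at the time; full URL-decoding is omitted for brevity):

    <%
    ' Sketch: extract the search phrase from the referring search URL.
    Dim strRef, re, matches, strPhrase
    strRef = Request.ServerVariables("HTTP_REFERER")
    Set re = New RegExp
    re.Pattern = "[?&]q=([^&]+)"
    re.IgnoreCase = True
    Set matches = re.Execute(strRef)
    If matches.Count > 0 Then
        strPhrase = Replace(matches(0).SubMatches(0), "+", " ")
        ' Log strPhrase: this recent-hits data is HitTail's raw material.
    End If
    %>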

Double-whammy Logo Design

The last thing that I want to do before I go home today (where my kittens, who are not used to me being away so late, will kill me) is a unified header to glue together the CMS and the Blogger pages. A single graphical header going across all the pages will go a long way towards unifying the two systems (blog & CMS) and catalyzing my vision of what the site is to become. Happily, I have a logo all ready. Rarely do I embark on graphics projects anymore, even though that is my training. I’m tired of the subjectivity of graphic design. Everyone is an expert, everything is subjective, and fashion rules.

Nonetheless, I dusted off my sketching skills and doodled out a design that I hope my old instructor, the master of ambigrams, John Langdon, who did the ambigram work in Dan Brown’s Angels & Demons, would be proud of. My logo is not an ambigram, but it uses the principles I learned in John’s typography class: how the strongest logos often zero in on a letter that says something about the overall word, and exaggerate it just enough to turn the word into a sort of visual onomatopoeia—a word that represents its meaning. Words like Bam and Sniff are onomatopoeias. It’s so much stronger than just adding the latest swoosh that is so prevalent in logos today.

Ambigrams are double-whammy design, because they work for more than one reason, but the same effect can be achieved without making a logo readable upside down. There was once a magazine named Family, which accomplished such design by dotting the “l” and a few other characters. Very effective. When I can, I try to make a logo work for 3 or 4 reasons. I think I nailed that here. First, you’ve got humor: what is “my long tail?” If that’s not an ice breaker, I don’t know what is. Second, you’ve got echoes of the ubiquitous logo that has been burned into all of our retinas: Google. I tried to make the placement of the “g” reminiscent of Google (although it’s a wholly different typeface). Third, I exaggerated the g so that its tail literally becomes the tail. I could go on, but three reasons is enough to show why it’s such a strong logo. I’ve got it up on the site.

OK, so I’ve uploaded the logo and put it at the top of both the CMS template and the Blogger template. I also took the step of unifying the styles from the two systems. That way, I won’t end up maintaining two sets of CSS, and I can just edit a single external linked file to tweak the overall look of the site without re-generating the static pages. It will also help enforce a unified look between CMS and blog.
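
The unification boils down to both templates pointing at one external stylesheet, something like this (the path is a placeholder):

    <!-- In the <head> of both the CMS template and the Blogger template: -->
    <link rel="stylesheet" type="text/css" href="/css/site.css" />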

Building Search-friendly From the Start

OK, now it’s time to apply a graphical header across both Blogger and the main site. And it’s time to make some commitments to a CMS system for the main site. There are many CMS systems out there, and the last thing I’m going to do is go through the learning curve on even an easy one. Is a website a Web application or a bunch of HTML files? For manageability, it has to be thought of as a Web app, but for search optimization, it needs to be thought of as HTML files.

Blogging software solved this long ago by “outputting” static HTML files from its database. This has a plethora of advantages, including reducing server load (serving static HTML files is much easier than executing code). Even if your dynamic pages are masquerading as static HTML, you’ve got increased server load—now two-fold: first, from the invisible reformatting of the URL that takes place with the mod_rewrite technique, and second, from executing code that probably queries a database, populates variables, and finally serves up the page. Static pages, while providing less customizability, are much better for high-volume sites.
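
To make the idea concrete, here’s a minimal sketch of what “outputting” static pages looks like in my world; the Pages table, its columns, and the output folder are all invented for illustration:

    ' Sketch of "outputting" static pages from the database (names invented).
    Dim cn, rs, fso, ts
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=SQLOLEDB;Data Source=(local);" & _
            "Initial Catalog=CMS;Integrated Security=SSPI"
    Set rs = cn.Execute("SELECT Slug, Title, Body FROM Pages")
    Set fso = CreateObject("Scripting.FileSystemObject")
    Do While Not rs.EOF
        ' One static HTML file per database row.
        Set ts = fso.CreateTextFile("C:\site\" & rs("Slug") & ".html", True)
        ts.Write "<html><head><title>" & rs("Title") & "</title></head>" & _
                 "<body>" & rs("Body") & "</body></html>"
        ts.Close
        rs.MoveNext
    Loop
    rs.Close
    cn.Close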

I believe I’ll be using our own home-grown CMS system for the rest of the MyLongTail site. The back-end controls don’t have the features or the polish of other CMS systems, but I know it inside and out. It gives 100% uncompromising artistic control (unlike most CMSs), and it creates pages that are perfectly optimized static HTML for search engines. And best of all, when things change on the Internet, I can just re-work the XSL transformations and appease the search engine algorithms du jour, at least as far as internal link structure is concerned. Our home-grown CMS system was designed specifically with SEO in mind, and more particularly, with non-commitment to website architecture or technology decisions. Very advanced XSL queries “knit” the website together, very much the way blogging software can rebuild the static pages of a blog. And because we control that transformation, nothing about the site’s architecture is locked in.

Anyway, I need to go through the steps I would take for any website using our CMS-for-SEO system. I will need at least one page on the site. From a scalability standpoint, my home-grown system is great when the entire site is going to be HTML. But much of this site is going to be an application. So, while I’m starting it this way for expediency’s sake, I very well may switch over to Ruby on Rails for an SEO-friendly app site. Additionally, much of the application will be written with AJAX, which is inherently SEO-unfriendly.

So as the site becomes more application-like and cooler and cooler, it will simultaneously become less friendly to search engines. That’s part of the reason why the blog is so important (Blogger is inherently SEO-friendly). Blogging lets us roll out content in a friction-free environment. Anyone who has managed corporate websites knows what I mean when I say friction. Because I’m blogging from Microsoft Word, I can roll out content with almost no friction. But the content that becomes the navigational framework of the site will come from my home-grown CMS, which is also inherently search engine friendly. Together, the blog and the navigation pages will create a very competent placeholder, so the site can start settling properly into the engines.