How Quick Into Google Search Results?
So, how quickly will a brand new domain show up on Google? I should have been checking day-to-day, but today is January 31, and the site was technically launched on January 16th. That's just over 2-weeks, the period of time traditionally quoted as the minimum for a Google sweep. So, now's a good time to do a quick review. Thankfully, the domain name is a totally made-up name, and I can do some very insightful tests here. I decide to search blogsearch.blogger.com first to ensure that at least the blog posts are in the specialized blog-content engine. It produces 28 results, including the first test post made on mylongtail.blogspot.com (now defunct). Every post was made since January 25. So, in one week, every blog post has been included in the blog search. 
Next, I search on mylongtail in the default search, and I see one result. It's the domain with no title or description. This is what's often referred to as the Google sandbox. We can see that Google is aware that the domain exists, but is not producing any of the site's content in the results. We see in the spiderspotter app that the first visit by GoogleBot was January 25th, the same day I started blogging. 
From the 25th to today is about 1 week (7 days, counting both the 25th and today). In seven days, we have gone from a previously unknown site to the domain being findable, but collapsed down to 1 page, with no actual page content in the results. How recent is it? A quick search on a couple of different Google datacenters reveals that even this 1-page listing is only on a couple of datacenters, and non-existent on others. So I am indeed catching it during the process of propagation, and we have our undisputed evidence that a site can go from zero to listed in some form in 1 week. Have I avoided the Google sandbox penalty altogether? 
And finally, we check for specific quoted content from the first blog post. I know it won't show, but I'm at least doing the test for the sake of completeness. I won't include a screenshot here, because it's identical to the one shown above. So, it's 1 week to show up at all. And it's some time longer before content appears. After content appears, the results tend to "dance around" (nicknamed the "Google Dance") before the data has propagated across all data centers. Another factor affecting the results settling down is something people don't talk about much. The Google patent from March of last year revealed the fact that Google is very sensitive to the amount that a site has changed from one visit to the next. That is to say, how much of the site has changed? How many new links have been established to the site? And when a site is brand new, every few pages you add constitute a significant percentage of the overall site. So, Google is seeing a very volatile site. And the results are correspondingly volatile. Therefore, when a page is first discovered, it goes into what I think of as a moving window of opportunity. I believe new pages get this extra relevancy boost to see if they have the potential for gangbusters fad-like success. Fad-like success? Fads, I believe, overrule traditional rules of slow organic growth. These are pages that somehow become massively popular, that everyone starts linking to, passing around in email, finding due to events in the news, etc. If a page does suddenly become massively popular, Google sees this, because they're quietly recording click-through data, similarly to how DirectHit did back in the day. But DirectHit's system, subsequently merged with Ask Jeeves, was ultimately defeated, because by touting that they were doing this, they invited click-stuffing abuse. Google, on the other hand, not only doesn't advertise click-through tracking, but uses very clever JavaScript to keep it from even looking like it's occurring. It's not evil. 
It's just smart. And if a site goes gangbusters, there is a totally organic pattern created that is difficult to fake, because there are hundreds of links from non-related sites, and thousands of click-throughs from disparate IPs that couldn't possibly be under one person's control. This fad traffic pattern then "buoys" that page's relevancy in future searches. This is just speculation based on observations, but it stands to reason that certain relevancy criteria can outweigh others, especially criteria that are both particularly difficult to fake and out of balance with the rest. Anyway, what are my conclusions? This test proves...
- How long does it take to go from zero to being in Google results at all? 1 week.
- How long does it take to go from zero to being in Google results in a meaningful way? Verdict not in, but expected soon. Stay tuned.
- How long does it take to go from zero to being in Google results in a stabilized and decent fashion necessary to drive sales? Will not know for three to six months.
- How long does it take to go from zero to being viewed as a healthy, growing site worthy of regular, predictable inclusion of new content? Well, that's the purpose of MyLongTail!
What about Yahoo, MSN and Ask Jeeves? Well, for the sake of completeness, Yahoo shows that it's mentioned on the Blogger recently updated pages page, but Yahoo's Slurp crawler didn't hit the site for the first time until yesterday. MSN is listing 6 results, including the blog index page of the MyLongTail site, the original blogspot URL, my Blogger profile, and a Ruby on Rails news digest page. Here are the screen shots of the Yahoo and MSN results. 

Benchmark Keywords Spanning Many Years
This post is about keyword benchmarking for search optimization. After an update to the Google search results last November, nicknamed “Jagger”, my personal domain dropped off the first page of results for my name, “Mike Levin”. The photographer of the same name maintained his #1 position. My site went four pages in, but a bunch of other pages that also referred to me moved onto the first page of results, including my Blogger profile, the profile on my employer’s site, and my SearchEngineForum staff profile. In other words, I went from holding position #2 with my personal domain that has been around for a very long time, to holding 3 lower positions on the first page of results, but losing my personal one. Admittedly, I haven’t kept my personal site very updated and the sites linking to it might be somewhat dubious, since I’ve been in the SEO circles since day one. SearchEngineForums is not only one of the oldest search-oriented forums on the Web, but it’s one of the oldest Web forums, period. It was started by Jim Wilson, who has since passed. Webmaster World has mostly taken its place in spirit. Over the years, I’ve worked as something of an intrapreneur (rather than an entrepreneur) at companies like Prophet 21 (now bought by Activant), and Scala Multimedia. There have been certain benchmarks over the years that have helped me gauge what was going on with search. Searching on my own name was, of course, one of them. And the Google Jagger update was significant in that newer sites, but not too new, suddenly had an edge over long-standing sites, which you might call stale. But another benchmark I occasionally monitor is the term “distribution software”. It was relatively easy to conquer across all the engines of the time, and has sustained itself remarkably over time. So, it was with great interest that I watched when the new purchasers of Prophet 21, and the awesome 3-letter domain p21.com, forwarded the Prophet site to an Activant third-level domain. 
I don’t think the third-level domain had been around for very long, but the Activant site had. So, would it incur the sandbox penalty? Would it maintain its across-the-board top positions? Was Activant unwittingly walking away from one of its potentially most valuable acquisitions and assets? The answer is that the Google juice transferred over from www.p21.com to distribution.activant.com very smoothly, at least for the benchmark keyword that I still monitor. The sandbox penalty had been evaded by using a sub-domain of a long-standing second-level domain. If you search on distribution software on Google, Yahoo, MSN and Ask Jeeves, you will find Activant as the VERY TOP result in 3 of the 4 engines, and #3 in Ask Jeeves (which still shows the old domain). This teaches us several lessons. Across-the-board fortified results of the sort I achieved (with help from a fellow named Steve Elsner) are transferable. The transfer can occur in a relatively short period of time (a matter of months). A sub-domain can quickly acquire a great deal of clout, probably more quickly than a newly registered domain, given the new Jagger reality. And when I left P21 back in 1999, I left the Web pieces in some very good hands, and someone at Activant took a gamble that paid off and gave me some important SEO lessons for the SEO landscape as it exists at this particular instant. Over time, a great deal of evidence mounts up that such-and-such a site is relevant on such-and-such a topic. These breadcrumb trails (mostly link topology) point back to hardwired domain names. So, changing a domain name is serious business. I have another situation similar to the above one, but the transfer of considerable existing-site clout was to a brand-new domain name. This was December of 2004, before anyone knew newly registered domains were about to have the wind taken out of their sails. 
Their site appeared in the top Google results on their keywords within four months of site launch, right on schedule and in line with our time estimate to the client. But then it dropped out and didn’t come back. The client cringed. We cringed. We applied about as much “upward pressure” as we possibly could without crossing ethical boundaries. I was convinced we were worse than stuck in the sandbox, because we had the positions for quite some time and lost them. Then, news broke of the Jagger update. I totally understood the reasoning, and did what I always do when that happens. I metaphorically climbed into the heads of the Google engineers and rummaged around in there for a while, and reasoned that if a domain was registered specifically for spamming, they would only be registering it for a year. If a site survived over that 1-year boundary, then bam! You’re out of the sandbox. So, I gave the client the time estimate based on the new domain launch. I laid out their options, and the risks of sticking it out or bailing to the old domain name too soon. They took our advice and stuck it out to get past that 1-year point, and it paid off. I nailed the time estimate of when the sandbox/Jagger penalty would lift down to the week. It was 1 year and 3 weeks after they launched the new site. One of my claims to fame in the SEO circles in the early years was my mission to conquer a 2-word keyword combo that landed squarely in the crosshairs of Macromedia, Apple, and a number of other companies: “multimedia software”. I achieved similar fortified results on this 2-keyword combo as I did with “distribution software”, and over the years, it has continued to hover around position #5. And although the term multimedia is so “80’s”, it is also highly competitive, maybe not in bidding, but certainly in how many products I had to push down. So, after I nailed the 2-word combo, I moved onto just the single term “multimedia”. 
I drove that sucker almost up to page one before I moved onto my next ventures. Also here, I worked the term “digital signage”, which was MUCH easier, since it was a bit more off the beaten track. It has still remained one of my benchmark keywords for taking the pulse of the search landscape. At Connors Communications, I really have my work cut out for me. It’s a PR firm, and doesn’t have the ace up its sleeve that both P21 and Scala had: a product and a user base. Yes, a product and user base are two of the most valuable tools for SEO. Because with a product, you can offer a free downloadable version, which triggers the viral marketing thing like little else. Everyone adds you to the download sites, and you suddenly have both inbound links AND buzz. But you also have a user base who, for better or for worse, are going to talk about you in forums, and blog about you, and link to you on their websites (sometimes other corporate sites if your product is corporate). It’s even better if you have a network of dealers, distributors and legacy users, which Scala did. It was mostly a matter of directing momentum, or as Sun Tzu would say, throwing rocks on eggs. SEO for Scala was quite easy. But Connors is a PR firm, which is a service. By nature, it can only serve a small number of clients at any one time. And no matter how talented the Connors crew is (and they are VERY talented, having launched Amazon, Priceline, and most recently, Vonage), it is still just a PR company without the advantages of a product or installed user base. So what is the hook I can hang my hat on from an SEO perspective? What will my benchmark keywords be for Connors? And how do I leverage all the zillions of search hits I’ll be generating for them with SEO if we can’t take everyone onboard simultaneously as clients? The answer to “what keywords” is “pr firm”, for which we’ve risen to page one in MSN, page 2 in Google and page 2 in Yahoo. 
This serves as a beachhead for other keyword combos (more on the beachhead concept in later posts), and shows that the methodologies I developed not only hold up across time (P21 and Scala), but also work across industries. So, the next step is to productize an aspect of the PR industry that is exciting to everyone, and can seem in many ways like a downloadable product. Once again, enter MyLongTail.
Added WiFi Hotspots, and Paying Less
So today I joined the ranks of the wireless warriors. I was on a $60/mo T-Mobile plan that got me 600 peak minutes, unlimited nights & weekends plus unlimited Internet and download (over the phone only). But now with my shiny new Averatec that everyone thinks is an Apple iBook, I have the itch to walk NYC, sitting down in any Starbucks to do my work. So after about a half hour of talking to a helpful T-Mobile rep named Sidney, I came up with a combo that gets me everything I want. I’m now spending $10 less per month, and I have unlimited T-Mobile WiFi hotspot access from my laptop as well. What I gave up were 90% of my peak minutes and evenings. After I got off the phone with them, I suspended my XP laptop at home, walked over to the closest Starbucks, and connected. A remote desktop session that I had running to my PC at the office came right back, even though I was on a different WiFi network and the laptop had been in suspend mode. Things have really improved.
Blog pinging and Pingomatic
I plan on understanding a lot more about how pinging works in blogging systems. I've built blogging systems, but before all this pinging stuff was going on, so nothing on those systems becomes part of the blogosphere proper. I just submitted MyLongTail at the Pingomatic site. And a lot can be learned just from that process, not the least of which is simply the names of the different pinging services. Pingomatic even shows you the feedback of the ping. I don't know if Pingomatic is using Web Services or simulating a webpage submit. But if this isn't an application built for Web Services, I don't know what is. For posterity, and for later review, I did a screen capture. Now that I did this ping, MyLongTail is really going to be in the blogosphere, because who knows what happens as a result of a one-time ping. There will very likely be discovery-bots sent out, and automatic revisits without pinging by proactive news gatherers. 
And for the sake of interest, here's the spider visitation within the first half-hour of doing this blog-ping. Some of these are crawlers that I haven't seen on MyLongTail before. Hmmmm... 
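For readers wondering what a Web Services ping looks like on the wire, here is a minimal sketch. It assumes the ping service accepts the common XML-RPC method weblogUpdates.ping with two string arguments (blog name and blog URL); whether Pingomatic's own web form works this way internally is not something this post establishes. The function names, the example blog name and URL, and the endpoint argument are all hypothetical placeholders.

```python
import xmlrpc.client


def build_ping_payload(blog_name: str, blog_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call.

    Building the payload separately makes it easy to inspect what would
    be sent without touching the network.
    """
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")


def send_ping(endpoint: str, blog_name: str, blog_url: str):
    """Send the ping to an XML-RPC endpoint and return the server's reply.

    `endpoint` is whatever URL the ping service documents; it is passed in
    rather than hard-coded because it is an assumption of this sketch.
    """
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.weblogUpdates.ping(blog_name, blog_url)


if __name__ == "__main__":
    # Hypothetical blog details, printed rather than sent.
    print(build_ping_payload("Example Blog", "http://example.com/blog/"))
```

Printing the payload first is a cheap way to see the protocol before deciding whether to send anything to a live service.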
TrackBack, Link Farms, Jagger Update and Blogger
How much do I miss the TrackBack feature by going with Blogger? Not at all. Why? New information shows just how non-helpful, and perhaps even damaging, it can be. Perhaps Blogger even left TrackBack out intentionally, knowing what was coming down the pike from parent Google, especially in light of the Jagger update from last November. Reciprocal links were penalized, or at least stopped delivering as much value as they used to. If every link you receive is reciprocated, you’re a link farm, at least as far as the Web topology you’re creating. To deal with this aspect of TrackBack, it’s generally all-or-nothing. You can use TrackBack or turn it off. The second strike against TrackBack is, of course, spam. People link to you and send a TrackBack ping specifically to get a link from your page, even if their site is totally unrelated. The reciprocal links with the unrelated sites go even further to create that terrible link farm topology. There’s a thin line between the organic pattern created by a genuinely popular site, and the pseudo-organic pattern created by link farms. My money is on Google getting better and better at recognizing these automatic cross-link patterns. And like every other spam trap, there’s some sort of threshold. Stay below that threshold, and you’re golden. Go over that threshold, and your site is flagged for human review, or possibly even automatic banning. The real way these blogging software companies should implement TrackBack is to get rid of the silly pinging and TrackBack codes. Blog posts don’t need a unique identifier. The permalink page has a URL! That’s unique enough. The code system is too geeky, and can be automated. Analytics-like tracking systems built into blogs should simply recognize people following links. If it’s a first-time referrer, the system should send a crawler out to check the validity of the page (not all referrers are accurate), and put that link into an inbox queue in the blog user interface. 
The person running the blog can then go and visit each of the sites and make a human evaluation whether it’s worthy of receiving a link back. If it is, they checkbox it. This has a number of advantages. First, the human checking process will block spam. Second, it will pick up many more referrers than the TrackBack system in its current form, which requires action on the part of the person linking to your blog. This information is already being passed back and forth. Why not use what’s already there? Third, it serves as a sort of competitive intelligence gatherer for the blogger. They get to see all referring links to their blog as a matter of interest, without necessitating that those sites receive a link back. The time has come, the Walrus said, to speak of many things. The do’s and don’ts of SEO, of tracking-back and pings.
--------------
An addendum to this post, moments after I published it: in going into Blogger's settings, I discovered the "Backlink" feature. It sounds like it's implemented much like I imagined. No codes are necessary. You just turn it on. So, I did (to get the experience). If I think it's starting to create a link-farm pattern, it gets turned off, pronto. It will be interesting to see what happens. It says that it uses the "link:" feature, which makes me think that the referring site has to be in the Google index, and perhaps even have passed whatever criteria they use to reduce the number of actual results reported by it. That would perhaps deal with the spam issue, if the site linking to the post needs to have, say, a PR of over 4.
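The referrer-driven alternative to TrackBack described in this post (notice a first-time referrer, crawl it to confirm the link really exists, queue it for a human decision) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name BacklinkInbox, its method names, and the injected fetch_page function are all hypothetical, and the "does the page contain the permalink" check is a deliberately naive stand-in for real link validation. It makes no claim about how Blogger's Backlink feature is built.

```python
from dataclasses import dataclass, field
from typing import Callable
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_with_urllib(url: str) -> str:
    """Default fetcher: download a page and decode it leniently."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


@dataclass
class BacklinkInbox:
    # The fetcher is injected so the queue logic can run without a network.
    fetch_page: Callable[[str], str] = fetch_with_urllib
    seen: set = field(default_factory=set)       # referrers already examined
    pending: list = field(default_factory=list)  # awaiting human review
    approved: list = field(default_factory=list) # blogger ticked the checkbox

    def record_visit(self, referrer: str, permalink: str) -> bool:
        """Queue a first-time referrer if its page really links to permalink."""
        if not referrer or referrer in self.seen:
            return False
        self.seen.add(referrer)
        if urlparse(referrer).scheme not in ("http", "https"):
            return False
        try:
            html = self.fetch_page(referrer)
        except OSError:  # urllib's network errors derive from OSError
            return False
        # Naive validity check: referrer headers can be faked, so require
        # the permalink to appear in the referring page itself.
        if permalink not in html:
            return False
        self.pending.append((referrer, permalink))
        return True

    def approve(self, referrer: str) -> None:
        """Move a reviewed referrer from the inbox to the approved list."""
        for item in list(self.pending):
            if item[0] == referrer:
                self.pending.remove(item)
                self.approved.append(item)
```

Keeping the human approval as a separate step is the point of the design: nothing gets a link back automatically, so the reciprocal pattern never forms on its own.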
Under-Promise, Over-Deliver
I woke up remarkably early, all things considered, and sent out an email informing my team I’d be taking Monday on the MyLongTail project. I have such momentum, and am so ready to wire up the main homepage to be ready for the first visitors, that I’d be crazy to go into the office, engage, and risk putting it off for another week. My to-do list for today looks like this…
- Create the template files.
- Put placeholder files in location for FAQ, SEO Best Practices, Why Sign Up.
- Fix the navigational links to point to these new placeholder files.
- Put the new navigational links into the Blogger templates.
- Figure out how the babystep tutorials are going to be linked in.
- Link in the first tutorial, and the respective spiderspotter app.
- Connect the submit form to lead management.
- Start putting content on the placeholder pages.
The work that I need to do on my second round of intensive focus includes…
- Final thought-work on the actual MyLongTail app.
- Creating the MyLongTail app.
- Giving a flavor for its power directly on the MLT homepage.
- Start communicating with the people who signed up early—probably create a public forum, so I can efficiently communicate with all of them at once, and they can communicate with each other.
- Ensuring that the conversations that are developing into Connors new client prospect opportunities are being handled properly.
One pitfall to avoid is actually acting on the information that the spider spotter app is revealing to me. For example, Bitacle bot has been trying to retrieve my atom.xml file from the wrong location. I realized the path I set to the XML feed in my Blogger settings was incorrect. I fixed it, but realized I had an absolutely fascinating app to write: one where I could measure the time between me submitting a blog entry, and spiders requesting that page or the data-feed. I think I’ll make a list of lists that I need on the MyLongTail site.
- Apps that I need to write (which will also become tutorials)
- Markets, industries and technologies that I want to target
- People that I need to reach out to (the influencers), and the message I need to deliver to each, based on their interests
- Topics for the SEO Best Practice
- Questions for the FAQ section
- Topics that I intend to blog about
- Pitfalls to avoid
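As a taste of the first item on that list, the app idea mentioned above (measuring the time between publishing a post and the first spider request for it) could start out something like this. It is a minimal sketch, not the spider spotter app itself: the function name, the hit format of (timestamp, path, user agent) tuples, and the user-agent substrings used to recognize crawlers are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumed lowercase user-agent fragments for the crawlers named on this blog.
CRAWLER_MARKERS = ("googlebot", "slurp", "msnbot", "bitacle")


def first_crawler_hits(published_at, path, hits):
    """Return {crawler marker: delay} for each crawler's first request.

    `hits` is an iterable of (timestamp, path, user_agent) tuples, a
    hypothetical simplification of whatever the web server actually logs.
    Requests made before publication or for other paths are ignored.
    """
    delays = {}
    for when, hit_path, user_agent in sorted(hits):
        if hit_path != path or when < published_at:
            continue
        agent = user_agent.lower()
        for marker in CRAWLER_MARKERS:
            if marker in agent and marker not in delays:
                delays[marker] = when - published_at
    return delays


if __name__ == "__main__":
    # Invented sample data, purely to show the shape of the result.
    published = datetime(2006, 1, 31, 9, 0)
    sample_hits = [
        (datetime(2006, 1, 31, 9, 20), "/atom.xml", "Bitacle bot/1.1"),
        (datetime(2006, 1, 31, 10, 0), "/atom.xml",
         "Mozilla/5.0 (compatible; Googlebot/2.1)"),
    ]
    for crawler, delay in first_crawler_hits(published, "/atom.xml",
                                             sample_hits).items():
        print(crawler, delay)
```

Sorting the hits first means the log does not have to arrive in time order, which matters if entries from several log files are merged.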
Perhaps the biggest pitfall of all is over-promising. There is little as damaging as building up expectations, only to let people down. I run the danger of over-promising to two different audiences: Connors (specifically, Connie), and the people who sign up early for MyLongTail. I have to start with a small, but potent kernel. MyLongTail will be modest in how far it’s reaching, but designed to strike a fundamental chord, one that’s in the tornado’s path. There’s no need to over-promise, because that small kernel is totally enough, and I have to focus on over-delivering that one small piece. That’s very Web 2.0 thinking, by the way. Because everything interoperates relatively easily, people can write mash-ups based on your app. Each person writing their mash-up is likely to have way more expertise in their problem domain than I do, so what they write USING my service is better than what I could write alone. My role then becomes to put out a few sample mash-ups to stimulate everyone’s imaginations. The MyLongTail app will be one of the first Web Services for SEO. Hopefully, it will have the same attractiveness as Tag Clouds, Blogrolls, Bookmarks, and all the other things that are serving as mash-up fodder and material for blog templates. Of course, Google Maps is the ultimate mash-up service, and I will continue to use it for inspiration. But no over-promising!
NYC PR Firm and SEO
OK, here I am at Hollywood Diner Sunday night at 1:00AM evaluating how I did over these past four days. There’s still more I’d like to do tonight, but a reasonable person would put it aside, get some good sleep, and be in the office tomorrow morning to catch up. I took 4 continuous days, Thu-Sun, to focus on this project. I basically ignored all emails (and that made all the difference), and bore down on the work. Am I happy with my progress? Does it match what I visualized? What I finished has indeed matched what I visualized very closely. I just haven’t finished as much as I would have liked. The baby-step tutorial markup project took two full days. But I knocked a lot of foundational issues out of the way. I’ve committed myself down the Microsoft, VBScript route in order to get the project finished. I have the first full tutorial done. I have the spider-spotter application finished. I have the homepage designed and implemented. I just don’t have Lead Management wired up, don’t have placeholder pages for the different top-level navigation pages, and don’t have the tutorial or spider-spotter app actually linked in. I did not achieve my objective of having this site operational as an opportunity generator before the weekend was out. But it’s all set up, just waiting to be hit home. Lead Management is working on the Connors site, and I could move it over quite easily. And it’s still early. I’m having coffee and getting some food at the Diner. So, I should be set until 5:00AM again. But I can’t do that if I’m committed to meetings tomorrow morning. It actually looks clear enough. I’ll have to send out an email that I’ll be taking another day. I really shouldn’t have to feel guilty about focusing on this. This is the value I have to bring to Connors, much more so than client management. Our people are very good, and self-sufficient. I’m mostly there for high-level guidance, a backup net, and for new business development. 
I want to be in on Tuesday for an on-site client meeting on one of the more detailed SEO projects that we do (URL re-writing). Once finished, the MyLongTail website will elevate Connors’ role in the PR industry from being one of NY’s top PR firms that specializes in emerging technologies, to being an emerging technology company itself. At the very least, it will be a PR firm that can demonstrate its technical chops in a very public, very glitzy way. So, how to get from here to there? Building the actual MyLongTail application, which I still haven’t really talked about yet, is the biggest part. The next step is to practice what I preach. By making the MyLongTail site massively successful, documenting as I go, I’ll be spelling out the MyLongTail formula and process. What I’ll be doing will actually go beyond the prescribed MLT-formula, but those things will actually be part of the playbook under SEO Best Practices. Even SEO Best Practices is a misnomer, but it’s the best label right now for the audience-building task. We will address all aspects of online marketing, publicity and promotion that are unpaid. Of course the employees’ salaries are going into the work, along with their electricity, rent, and all other burden costs. But what is not going into it is a large marketing budget. It is quite possible for a single, passionate individual to outperform an entire marketing department through word-of-mouth evangelism. The Internet and Web simply bring automation and persistence to old-fashioned word-of-mouth. Search is a special part of the equation, because it’s the wildcard, and the one on which fortunes can flip-flop. It’s the area that has an amplifying effect that you don’t have to pay for, resulting in getting more out than what you put in. So, isn’t Connors setting up its own competition with the MyLongTail site? In some cases, yes. 
We will be spelling out a process whereby a dedicated individual within a company can create a lot of publicity for themselves without hiring an outside company, and with much less investment than a traditional marketing budget full of advertising and events. In a very real way, we will be teaching them how to do a very advanced form of high-tech PR, which is exactly our specialty. Then, why is Connors doing this? Because the total number of people needing this far exceeds what we can service, and we would rather have this relationship with you than not. We would rather be the ones ushering in this next evolution of search marketing than not. How can it result in anything but good for Connors, and those we will have the privilege of serving? We believe in sowing a thousand seeds and seeing what blossoms.
UPDATE: Connors has evolved from traditional PR to high end search engine marketing. Click here to learn more about our transition - http://www.connors.com/seo/letter.html
Foundational Design and SEO Considerations
Now, it's time to consider aesthetics and search optimization considerations. As many people in the SEO field will tell you, there is a balance to strike between SEO best practices, and design perfection. If you were just going after perfect search optimization and usability, everything would look like Jakob Nielsen's useit.com (talk about a dated look). But to go after uncompromising design, you would do the entire thing in Macromedia Flash, and make the entire site invisible to search, defeating a primary purpose of a website: generating the sales opportunity in the first place. But just as a skilled poet can communicate perfectly while maintaining pentameter and rhyme, a skilled Web developer can seamlessly combine optimization and design. I have three extreme advantages going in my favor...
- I'm creating the site from scratch, so all my decisions are foundational.
- I'm a V.P. of the company AND the entire art/programming team on this project, so I have no artist to satisfy.
- I'm proceeding with a very clean and sparse Google-like look, so art's not a large project.
I should not use the Google sparse look as a license to go boring. Remember my comment about Jakob's site? I don't want to be a hypocrite. So, I do indeed plan on spicing up the look of the site. I'm quite partial towards the look of the blog-site too-biased, the blog site for which the Ruby on Rails Typo program was developed. So, what design parameters do I have to work with?
- Logo
- Value proposition in the form of a tagline
- Navigational elements
- Pervasive Sign Up form
- Space to push out message du jour—when the MLT app is done, this space will be where we give a preview of the sizzling visual of MyLongTail.
The logo placement is already decided. The sign-up form will initially be just a single line positioned like a search box. The initial tagline is already written. So, I need to nail down the navigational elements. I think I'll make them very Google-like in that they're plain text links, easily changeable, and implying tabs even though the tabs aren't really there. Plain text links being used as if they were tabs (and maybe making them look like tabs wholly through CSS) is a perfect example of the 80/20 rule in design. I could spend a whole weekend just designing a cool tab look that someone, somewhere would hate (or that would break some browser). Design is so subjective and pitfall-ridden, that you have to choose your battles carefully. I'm sure to do a dedicated post on that topic later. But for now, I'm starting out with the navigational elements:
- Home
- Why Sign Up?
- SEO Best Practices
- Blog
- FAQ
The visual proportions and weighting when these words are laid out are perfect. I would like to add the terms "PR" and "Pod" to the navigational links, but it really throws off the balance right now, and I won't have content to add to the Pod section right away. Everything from the PR link could be put under the FAQ link. FAQ feels a little old school, but PR is too obscure. Everyone understands what an FAQ is these days. And everyone understands Blog. But even Blog is getting to feel a bit old school. Pod is the way to go, but I can't right now. I'll be able to populate the FAQ quite easily using the CMS. But I will be able to do Pods soon. I'm going to set up a tiny but adequate audio/video production facility. Talk about humanizing a site. I can video-document the birth of a Web 2.0 company, and my becoming a part of the Manhattan scene. I'll try to produce something that maybe could be picked up by Google Current (Google's cable TV channel). They call Pods VC^2 for viewer contributed content. I'll attend the iBreakfasts and more conferences, helping generate buzz for my own site by promoting them. Maybe I'll pitch the idea to my neighbor, Fred Wilson, who publishes submitted Pod-format elevator pitches on his Union Square Ventures website. I'm not going to make an extensive PodCasting post here, but it does merit mention. Just as developing public speaking skills is necessary for certain types of careers, being able to speak well is turning into an optional, but compelling part of Web publishing. Another important aspect is that I'm putting all my content, with the exception of the logo, into plain text. The headline and tagline definitely look better as graphics. But I need every scrap of SEO-power I can muster in constructing this site. As is the overwhelming trend these days, I'll be using div's to format style. But unlike today's trends, I'll very deliberately be using tags like p (for paragraph) and b (for bold) to keep the semantics in place. 
Nothing has set back the Semantic Web like the proliferation of meaningless div id's, and the stripping out of all conventional document context. HTML tags like h1, p, b, i, blockquote, and many others are still very much worth using, because they are part of the clues you're leaving search engines about what's important. Just use div's to block together elements for stylization. Span's are an interesting question, because they are inline. On the one hand, you can avoid them completely by putting id's on elements such as bold or italics. But then you change the conventional presentation of these tags and risk the search engines' parsers not knowing what they are at all. A compromise solution is to continue to use bare-bones tags like b or i, and just put a span tag AROUND the conventional HTML tag. It's a bit of extra code, but it purges out all ambiguity. You know with certainty that div's and span's will be parsed for attributes. It's also very likely that a lot of dumb parsers are not expecting attributes on p's and i's. So, this combination removes all ambiguity, and forces search engines to accept the meaning that you intend.
May be misinterpreted...
<b id="bling">stylize me</b>
Withholds information from the Semantic Web...
<span id="bling">stylize me</span>
A little bit of extra code, but cannot be misinterpreted...
<b><span id="bling">stylize me</span></b>
There are important facts to consider. If you are willing to make all your "b" tags look alike, you can just create a style that applies to all your bold's. This is how so many blogs change their anchor-text style from a solid underline to a dotted underline. If you're able to do this, you don't need the extra span tags, and you don't need ID's on your bold tags. That's another best-case scenario. But what I'm considering here is the main homepage of MyLongTail.com, and main homepages always have a different set of rules. 
You need to stylize elements on a case-by-case basis without affecting the whole rest of the site. Now the issue to keep in mind here is the "C" in CSS. C stands for cascading, meaning that when several styles compete for the same element, there are rules for which one wins. All else being equal, the last style wins. Inline elements like span cannot/should not carry block element attributes. So, you can't rely on vertical margins and padding on a span element. Use div's when it's like a blockquote, and use span's when it's like a bold. Styles are rendered outside-in. That is, the definition of span will override the b tag. This is great for inline text, and really helps with the Semantic Web. But when you're using the above bling example with a paragraph tag, it doesn't hold up. Div's and span's are only meta-container tags. That is, they only exist to contain other elements, and add some meta-data such as ID's and class names, and imply nothing about content relevancy or importance. Everything belonging to such a unit belongs INSIDE the container, especially if you're using the container to move things around, such as you do with div's. So you see, you can get away with the bold tag outside a span, because it will cascade properly, and you never MOVE a span. You get the semantic value of the bold tag, but the span tag wins in applying style, because it's working outside-in. But you can't do that with div's, because a bare-bones paragraph tag inside a div tag will override the div's style with the paragraph's default style. And you can't change the default paragraph's style without affecting the rest of the site (or page), and you can't add an ID to the p or you throw off creating a perfect document structure for the Semantic Web. It's something of a conundrum, and those who can solve it get a sliver of potential SEO advantage. Many things in SEO are not about whether they definitely provide a boost today, but are rather about whether they may ever produce a boost someday, and can never be interpreted as bad form or spamming. 
Often, the solution is to stick to the bare-bones HTML code in order to get the semantic advantage, but to use a second style definition for just that page that overrides the global style. Practices like this may seem over-the-top, but it's the weakest-link-in-the-chain principle. Much more on that later. Well, this has been quite a post. I could break it into smaller posts, but it really was part of one unit of thought, so I'll keep it intact. But that leads to the SEO issue of what to name the post. The title transforms into the title tag, the headline of the permalink page, the words used in anchor text leading to the page, and the filename. This all combines to make it the single most search-influential factor for the page. If I were going wholly for optimization, I would break this post into many smaller posts, using the opportunity to create more titles, and consequently more sniper-like attempts at Web traffic on those topics. But more on that later! Labels: Mike Levin
Short-term Objectives
OK, I’m effectively done with the first of the two spider spotting projects, and I think I’m going to not do the second one today. The first project has given me the structure for stepping through all my log files and extracting what I need for the second project, so there is no urgency. No data is being lost. Where there is urgency is getting the MyLongTail site a little more ready for prime time. Not that it will be a complete app, or even make a lot of sense to people right away. But it can no longer look like a work in progress. Thanks to the popularized “clean” Google main homepage look, it’s quite easy to make a site look finished when it’s not even close. That’s what I’m going to do this weekend. But I need to guide the precious fleeting focus time left with a plan. Keep in mind the 80/20 rule, because it’s really got to be applied here. What are some of your objectives? Create the template files from the CMS system, so you can wrap any of your ASP files in the rest of the site’s look. I do something like Server Side Includes (SSI), but I don’t physically break the master templates into header and footer files. I find it more powerful to keep template files in one piece, and just mark them up with content begin and end tags. Put the first babystep tutorial onto the MyLongTail site. You went through all this effort to produce the first one. So, you need to plug it in. Also, add the spider spotter application and cross-link it with the tutorial. I should also think about cross-linking with blog posts. I have sort of a structure going here:
- Thought-work
- Baby-step Tutorial
- The Application
Too many words. Can I abbreviate it?
- Thoughts
- Baby-steps
- The App
They each have a strong, distinct identity. That’s good. I don’t think this is something that will really last once MyLongTail starts to go mainstream, because it’s a little too tech-geeky. I’m not recommending with MyLongTail that the average marketing person go through these tutorials. 
But whenever a Marketing person wants to give their group a competitive advantage, I would like to provide him/her with a convenient link to forward to their Tech. OK, another objective of MyLongTail is actually to find prospective clients for Connors Communications. As the world is getting more technical, many public relations firms are getting left in the dust. They’ve been able to catch onto blogging in great part, because blogging software is just so simple. But that’s not enough. You need to know how to give your clients cutting-edge advice regarding their corporate blogging strategies. Now consider, that this site started getting spider visits within days of being created, and a search engine submit never even occurred. Why? Because I planted the blog on the same domain as the main site, transferring Google juice to the “main” corporate site. And the blog search hits are still valuable from a sales standpoint, because the blog is wrapped in the main site’s navigation. You are always presented with the company’s encapsulated message (logo, tagline, etc.), and are one click away from the main homepage. This is not always the direction you want to go, but how many PR firms can speak to these issues with authority? What’s more important, search engine optimization or having a corporate blog? What is the relationship between the two? Should the blog be on the main corporate site, or its own separate domain and entity? How can search optimization be done without risking future banning? What can we do if we’ve committed to a particular Web technology infrastructure that prevents us from performing search optimization? So, that secondary objective is generating new prospective client opportunity for Connors, and my goal for today is to have an easily-applied template look, and to activate the sales lead acquisition and management system. Such a system worked like gangbusters for me in the past, because it was a unique and differentiated product. 
But now, I’m in the PR industry. So, MyLongTail will be Connors’ unique and differentiated product that you have to sign up to get. But it won’t be done in a week, so I will need some simple explanation of what MyLongTail is, enough to entice people to volunteer contact info. And there should be two different ways to capture contact data:
1. Email-only, which is just enough to do some sort of follow-up.
2. Full contact info, necessary for more thorough follow-up.
I want to make a very strong value proposition and teaser to get people to sign up. But I also want to start putting the right sort of content here (and on the Connors site) to draw in promising prospective clients. For a little bit of time, the MyLongTail site is going to be a little bit of a playground. I want a way to mention all the various industries who could benefit by using the MyLongTail site. I also need to talk about a lot of marketing principles and how they apply in the evolving online landscape. I should really nail down what the main navigational elements are, because that’s going to inform, guide and influence the rest of the development of the site. It also will have an implied version of the site’s value proposition.
- SEO Best Practices
OK, I’m saying that MyLongTail is a better way. So, why not…
- SEO Good Practices
- SEO Best Practices
Is it an SEO site? Yes, for now. But it will also be a public relations site. And I want to keep the message VERY simple.
- Why Sign Up?
- SEO Best Practices
- PR
- Blog
- FAQ
Then, there are the items beneath the surface of the iceberg (more on that philosophy later). Those include…
- Emerging markets, industries and technologies
- The Baby-step tutorials
- Marketing principles, traditional and new
- Geek issues, like watching spider activities
So, the to-do list reads like this…
1. Adjust the navigational links.
2. Create the template pages.
3. Put place-holder pages in for each link.
4. Turn the main homepage into an email address collector.
5. Make the email response page offer to start sales lead process.
Issues to keep in mind
- I’m going to want to plug in the first babystep tutorial.
- I want the main homepage to feature the latest thing: tutorial, blog post, etc.
- I need to make it compatible with lead management on the Connors side.
Labels: Mike Levin
Caffeine is My Drug of Choice
OK, let's get the first little nested project out of the way. Find a post that meets the condition of having babystep code, but where the previous post doesn't. But back a little further, there is more babystep code. It's a recursive app, going back in time, feeding the most recently considered post ID as a parameter, plus the master message ID. The function, when given a master message ID, looks at the immediately prior post in that same discussion to see if it finds a babystep post. If it finds a match, it returns that post's ID. If it doesn't find one, it calls itself. This relies on the newly found ID bubbling up through the recursion. But of course, I don't trust that in VBScript, so I'm going to use a global variable. The recursion automatically ends when it reaches the master ID, which is the first post in the discussion. That project is out of the way. It's 12:30AM on a Saturday night in Manhattan, and I'm just getting underway with a programming project. Sad. But that's my choice. It's only with this sort of mad dedication that truly inspired projects come to fruition. I've had too much time feeling like I was just spinning my wheels not getting anywhere. It's time now for that drive that gets wasted on term papers in college. I often think how much greater the world would be if the youthful energy that gets dumped into diplomas to hang on the wall, and stupid rites of passage, actually got funneled into entrepreneurial projects with a positive social impact. The world would be a much better place. Anyway, to build and keep the momentum for the spider-spotting project, I need caffeine. Time to run out. This site is called MyLongTail, because it is going to focus on the long tail of search, and ways to tap into the power of unpaid search without resorting to shadowy practices. But I'm thinking I may also want to call it MyFullLifecycle, in how it's addressing two full lifecycles: first, the birth of the site itself. 
This goes from the creative parts, to the first spider visits, to the first search hits, to the first user feedback, to the first user of the service, to the de-geekifying of the site once it starts to catch on, to the site's rise to popularity. But it also will be very concerned with the lifecycle of the customer, from getting into their head to know what type of searches they're going to perform, to finding MyLongTail, to eventually providing contact info, to signing up for the service, to productively using the service, to measuring this user as a win or a loss based on them getting the next person in (more on that later). But you can see, I'm thinking in depth about both the lifecycle of the site, and the lifecycle of customers using the site. OK, the re-engagement process is important for maintaining focus. I went to grab a bite to eat, and pick up some caffeine. When I got back, I immediately wanted to plop in front of the TV and veg out. I see that I am in constant need of stimulation. TV provides it way too easily. I've got to switch to radio and music, so I can keep doing it even while I'm working. But I've never much been one for music. Nothing ever pulled me in to really make me a fan. You can count on one hand the number of CDs I bought. And the things I like are usually so offbeat that they don't even constitute a genre. So, I'm using Pandora to find more music I might like based on the handful of things I really enjoy. But the Animaniacs and Eric Idle haven't made it into the Music Genome Project. I really like novelty music. My best luck so far has been from putting in the seed song "The Lime in the Coconut". It describes the station as mild rhythmic syncopation, heavy use of vocal harmonies, acoustic sonority, extensive vamping and paired vocal harmony. I chose Caffeine by Toxic Audio, which I enjoyed and found appropriate, so I guess it works. OK, let's really get started with spider spotter project #1. It's 1:20AM. 
It seems like I piddled away hours since I started, but not really. I actually made most of the design decisions in my head. I can jump into this thing head-first. I'm really excited about creating my first publicly consumable baby-step tutorial. This is one that will actually be of great use to some people.
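The recursive lookup described at the top of this post can be sketched roughly as follows. This is a minimal sketch in Python rather than the VBScript the project actually uses, and it returns the found ID through the recursion instead of through a global variable. The `[babystep]` marker, the `posts` and `prev_of` dictionaries, and the sample thread are all hypothetical stand-ins, since the post doesn't say how babystep code is tagged or stored.

```python
import re

# Hypothetical marker for babystep code; the real tag format is not given in the post.
BABYSTEP_RE = re.compile(r"\[babystep\].*?\[/babystep\]", re.DOTALL)

def find_previous_babystep(posts, prev_of, post_id, master_id):
    """Walk back from post_id toward master_id and return the ID of the
    nearest earlier post that contains babystep code, or None."""
    if post_id == master_id:
        return None  # reached the first post in the discussion: stop
    prior = prev_of[post_id]  # the immediately prior post in the same thread
    if BABYSTEP_RE.search(posts[prior]):
        return prior  # match found; this value bubbles up through the recursion
    return find_previous_babystep(posts, prev_of, prior, master_id)

# Made-up thread: post 1 is the master post, posts 2 and 3 are thought-work only.
posts = {
    1: "Intro [babystep]a = 1[/babystep]",
    2: "Just thinking out loud.",
    3: "More thinking.",
    4: "Next step [babystep]a = 2[/babystep]",
}
prev_of = {2: 1, 3: 2, 4: 3}

print(find_previous_babystep(posts, prev_of, 4, 1))
```

Starting from post 4, the sketch skips the two thought-only posts and lands on post 1, which is the behavior the babystep markup needs when a tutorial pauses for a few posts of thinking.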
MSWC.IISLog or the TextStream Object to Parse Logfiles
OK, the first step in the first spider spotter project is choosing which technology to use to open and manipulate log files. There are basically two choices: the TextStream object, and the MSWC.IISLog object. Both would be perfectly capable, but they bring up different issues. The power of manipulating the log files as raw text comes in using regular expression matching (RegEx). But doing RegEx manipulation directly within Active Server Pages requires dumping the contents of the log file into memory and running RegEx on the object in memory. And log files can grow to be VERY large. One way to control how much goes into memory is to encase the ReadLine method of the TextStream object in logic to essentially create a first-pass filter. So, if you were looking for GoogleBot, you could pull in only the lines of the logfile that mention GoogleBot. Then, you could use RegEx to further filter the results. The other approach is to use MSWC.IISLog. I learned about this from the O'Reilly ASP book. It essentially parses the log file into fields. And I'm sure it takes care of a lot of the memory issues that come up if you try using the TextStream object. One problem is that it's really a Windows 2000 Server technology, and I don't even know if it's in Server 2003. It uses a dll called logscrpt.dll. So, first to see if it's still even included, I'm going to go search for that on a 2003 server. OK, found in the inetsrv directory. So, it's still a choice. The next thing is to really think about the objectives of this app. It's going to have a clever aspect to it, so the more you use it, the less demanding it is on memory. And I'll probably create a dual ASP/Windows Scripting Host (WSH) existence for this program. One will be real-time on page-loads. And the other will be for scheduled daily processing. Even though it's really not worth pulling the entire logfile into a SQL database, it probably is worth pulling in the entire spider history. 
Even a popular site only gets a few thousand hits per day from GoogleBot, and from a SQL table perspective, that's nothing. So, why write an app that loads the log files directly? It's the real-time nature of the thing, and the fact you'll usually be looking at the same day's logfiles for up-to-the-second information. So, the first criterion for the project is to work as if it were just wired to the daily log files. But lurking in the background will be a task that, after the day's log file has cycled, will spin through it, moving information like GoogleBot visits into a SQL table. It will use the time and IP (or UserAgent) as the primary key, so it will never record the same event twice. You could even run it over and over without doing any damage, except maybe littering your SQL logs with primary key violation error messages. MSWC.IISLog has another advantage. Because it automatically parses the log file into fields, I will be able to hide the IP addresses on the public-facing version of this app if I deem it necessary. Generally, it will only be showing GoogleBot and Yahoo Slurp visits, but you never know. I'd like the quick ability to turn off the display of the IP field, so I don't violate anyone's privacy by accidentally giving out their IP addresses. OK, it sounds like I've made my decision. I don't really need the power of RegEx for spotting spiders. IISLog has a ReadFilter method, but it only takes a start and end time. It doesn't let you filter based on field contents. OK, I can do that manually, even with RegEx at this point. If it matches a pattern on a line-by-line basis, then show it. Something else may be quicker, though. OK, it's decided. This first spider spotter app will use MSWC.IISLog. I'm also going to do this entire project tonight (yes, I'm starting at 11:00PM). But it doesn't have nearly the issues of the marker-upper project. And it is a perfect time to use the baby-step markup system. I do see one issue. 
There are two nested sub-projects lurking that are going to tempt me. The first is a way to make the baby-step markup able to get the previous babystep code post no matter how far back it occurred in the discussion. That's probably a recursive little bit of code. I think I'm going to get that out of the way right away. It won't be too difficult, and will make the tutorial-making process even more natural. I don't want to force babystep code into every post. If I want to stop and think about something, post it, and move on, I want to feel free to do that. The other nested project is actually putting the tutorial out on the site. I've got an internal blogging system where I actually make the tutorials. But deciding which ones to put out, how, and onto what sites is something that happens in the content management system. Yes, the CMS can assemble Web content for sites by pulling it out of blogging systems. In short, the CMS can take XML feeds from any source, map them into the CMS's own data structure, apply the site's style sheet, and move the content out to the website. But the steps to do this are a little convoluted, and I have the itch to simplify it. But I'll avoid this nested sub-project. It's full of others.
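Two ideas from this post, the line-by-line first-pass filter and the "time plus IP as primary key" trick for safe re-runs, can be sketched together. This is a minimal sketch in Python with SQLite standing in for ASP and SQL Server, not the app itself. The six-field log layout, the table name `spider_visits`, and the sample lines (which use documentation-range IP addresses) are assumptions for illustration only.

```python
import re
import sqlite3

# Assumed field order: date time c-ip cs-method cs-uri-stem cs(User-Agent).
# A real IIS log declares its own order in the "#Fields:" header line.
LINE_RE = re.compile(r"^(\S+) (\S+) (\S+) (\S+) (\S+) (\S+)$")

def spider_hits(lines, needle="Googlebot"):
    """Yield (timestamp, ip, uri, agent) for lines that mention the spider."""
    for line in lines:
        # Cheap first pass: skip header lines and lines that never mention the spider.
        if line.startswith("#") or needle.lower() not in line.lower():
            continue
        match = LINE_RE.match(line.strip())  # second pass: RegEx on survivors only
        if match:
            day, time, ip, _method, uri, agent = match.groups()
            yield (f"{day} {time}", ip, uri, agent)

def store_hits(conn, hits):
    """Insert hits; the (hit_time, ip) primary key makes re-runs harmless."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spider_visits ("
        "hit_time TEXT, ip TEXT, uri TEXT, agent TEXT, "
        "PRIMARY KEY (hit_time, ip))"
    )
    # INSERT OR IGNORE skips duplicates quietly instead of raising key violations.
    conn.executemany("INSERT OR IGNORE INTO spider_visits VALUES (?, ?, ?, ?)", hits)
    conn.commit()

# Made-up sample lines, not real log data.
SAMPLE_LOG = [
    "#Fields: date time c-ip cs-method cs-uri-stem cs(User-Agent)",
    "2006-01-25 04:12:09 192.0.2.10 GET /robots.txt Mozilla/5.0+(compatible;+Googlebot/2.1)",
    "2006-01-25 04:12:10 192.0.2.77 GET /index.asp Mozilla/4.0+(compatible;+MSIE+6.0)",
    "2006-01-25 04:12:11 192.0.2.10 GET /index.asp Mozilla/5.0+(compatible;+Googlebot/2.1)",
]

conn = sqlite3.connect(":memory:")
store_hits(conn, spider_hits(SAMPLE_LOG))
store_hits(conn, spider_hits(SAMPLE_LOG))  # running it again records nothing new
print(conn.execute("SELECT COUNT(*) FROM spider_visits").fetchone()[0])
```

One caveat with a time-plus-IP key: if the log only has one-second resolution, two requests from the same spider IP in the same second collapse into one row. Adding the requested URI to the key would be one way around that, if it ever matters.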
Evaluating Spider-spotter Projects
The baby-step documentation system is working, and now it's time to build the 2 spider-spotting projects up from scratch. Now that this site has a little bit of content on it, and posts have been made with the blogger system, and people have surfed to it who may have toolbars that report back the existence of pages, and because I have a couple of outbound links that will begin to show up in log files—because of all of these reasons, the first spider visits will start to occur. And that's what we're interested in now. But are we tracking search hits yet? No, that comes later. So, how do we monitor spider visits? There are 2 projects here. First, is specifically monitoring requests for the robots.txt file. All well-behaved spiders will request this file first to understand what areas of the site are supposed to be off limits. A lot of concentrated information shows up here, particularly concerning the variety of spiders hitting the site. You can't always tell a spider when you see one in your log files, because there are so many user agents. But when one requests robots.txt, you know you have some sort of crawler on your hands. This gives you a nice broad overview of what's out there, instead of just myopically focusing on GoogleBot and Yahoo Slurp. The second project we will engage in will be a simple way to view log files on a day-by-day basis. Log files are constantly being written to the hard drives. And until the site starts to become massively popular, the log files are relatively easy to load and look at. ASP even has dedicated objects for parsing and browsing the log file. I'm not sure if I'm going to use that, because I think I might just like to load it as a text file and do regular expression matches to pull out the information I want to see. In fact, it could be tied back to the first project. I also think the idea of time-surfing is important. Most of the time, I will want to pull up "today's" data. But often, I will want to surf back in time. 
Or I might like to pull up the entire history of GoogleBot visits. It's worth noting that you can make your log files go directly into a database, in my case, SQL Server. But you don't always want to do that. I don't want to program a chatty app. Decisions regarding chattiness are a theme that will come up over and over in the apps I make for MyLongTail. And exactly what is chatty and what isn't is one of those issues. Making a call to a database for every single page load IS a chatty app. So, I will stick with text-based log files. They have the additional advantage that when you do archive them, text files compress really well. Also, when you set the webserver to start a new log file daily, it makes a nice basis for writing a date-surfing system. For each change of day, you simply connect to a different log file. It will always be an issue whether thought-work like this ends up going into the blog or into the baby-step tutorials themselves. I think it will be based on the length and quality of the thought-work. If it shows the overall direction the MyLongTail site is going, then it will go into the blog. So, this one makes it there. Time to post, and start the tutorial. Which one comes first? Am I going to slow myself down with copious screenshots? They actually can be quite important for an effective tutorial. But they can make the project go at almost half the speed. So, I'll probably be skipping screenshots for now. So, the robots.txt project or the log file reading project? There is definitely data in the log files, even if it's just my own page-loads. But there's not necessarily any data if we grab right for the robots.txt requests. That would make that app difficult even to test with no data. Except, I could simulate requests for robots.txt, so that really shouldn't stop me. So, I'm going to go for the easiest possible way to load and view the text files.
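Both projects in this post reduce to a few lines once the log is treated as text. The sketch below is Python, not the ASP the site runs on, and it rests on two assumptions: that the server writes one file per day using the `exYYMMDD.log` naming pattern, and that each line carries six fields in the order shown in the sample header. The sample lines are made up.

```python
from collections import Counter
from datetime import date, timedelta

def daily_log_name(day):
    """Date-surfing: each day maps to its own log file (naming pattern assumed)."""
    return f"ex{day:%y%m%d}.log"

def robots_requesters(lines):
    """Count robots.txt requests per user agent to survey which crawlers visit."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip log header lines
        fields = line.split()
        # Assumed order: date time c-ip cs-method cs-uri-stem cs(User-Agent)
        if len(fields) >= 6 and fields[4].lower() == "/robots.txt":
            counts[fields[5]] += 1
    return counts

# Made-up sample lines with documentation-range IP addresses.
SAMPLE_DAY = [
    "#Fields: date time c-ip cs-method cs-uri-stem cs(User-Agent)",
    "2006-01-28 01:00:00 192.0.2.10 GET /robots.txt Googlebot/2.1",
    "2006-01-28 01:05:00 192.0.2.20 GET /robots.txt Yahoo!+Slurp",
    "2006-01-28 01:06:00 192.0.2.30 GET /index.asp Mozilla/4.0+(compatible;+MSIE+6.0)",
    "2006-01-28 09:00:00 192.0.2.10 GET /robots.txt Googlebot/2.1",
]

today = date(2006, 1, 28)
print(daily_log_name(today))                      # the file for "today"
print(daily_log_name(today - timedelta(days=1)))  # one step back in time
for agent, hits in robots_requesters(SAMPLE_DAY).most_common():
    print(agent, hits)
```

Surfing back in time is then just subtracting days and opening a different file, which is why the daily rollover setting pays off.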
Blogging and Search as Mainstream Media
That last entry just shows you the difficulty of separating work and personal on an endeavor like this. It’s going to be all-consuming for a while. Balancing it with personal life isn’t (right now) about balancing it with rich social activity. It’s more about balancing it with keeping the apartment clean and paying the bills. I will be constantly working to make MyLongTail publicly launchable in under 2 months. Connie told me I can bring in whatever help I need to get this done. But even just explaining what I have in mind adds too much overhead to the project, especially in light of what agile development methodology makes possible. Agile and Web 2.0 go hand in hand perfectly, with their bad-boy, contrarian approaches. It’s a thin line that separates Agility from hacking, and Web 2.0 from un-professionalism. The difference is that the big down-sides are removed. Agility provides hacking that has long-term scalability and manageability. Web 2.0 provides parts that can be glued together so single people can TRULY write apps that are even better than what used to take large teams. The two big enterprise frameworks promised to do this: .NET and J2EE. And I tried both. The problem, from my standpoint, is the lack of agility. Consequently, my decision for now is to stick with VBScript, and later to go to Ruby on Rails. Not every journal entry like this should become a post right away. In order to keep even the thought work of separating and designing posts out of the picture, I’m going to run with a stream of consciousness entry like this throughout the day, when I can. Little chisel-strike posts on programming concepts will probably go into the CMS/baby-step tutorials throughout the day. This entry will be to process thoughts and keep me on track. This project is acquiring the momentum that it needs. I have had difficulty drowning out the thoughts related to my previous employer because the nature of the work there got so devastatingly interesting. 
What I did there was take a bunch of apathetic slackers who knew that the investor gravy train would never run out, and make them care about sales. Metaphorically, I both led the horse to water AND forced it to drink. The details could constitute a book. Suffice to say it involved generating the opportunity through search hits, capturing the contact data, and attempting to force follow-up through business systems. The company busily occupied itself with documenting the fact that they were not interested in making sales, creating an untenable situation that culminated in what I feel was an attack on my career. This took the guise of a battle over resources. By the time the dust settled, I was left standing and new leadership took over who was sympathetic to my cause. I have since moved on to greener pastures, but this dramatic experience flits into my mind on a regular basis even now, because there’s nothing more interesting yet to replace it. I need a very big challenge that exercises my mind as opposed to my time-management and juggling skills (key aspects of the agency environment). MyLongTail needs to become that challenge. It needs many similar aspects of what I did at my last place. But whereas that place had a downloadable product that fueled the machine, the field of public relations is very undifferentiated, even if it is a leading NYC PR firm. So, two things are changing everything. They’re both Web-based. The first is search engines. How many things since TV, phone, car and email have changed the way we relate to the world around us? How many times a day do you turn to a search engine for answers? Second is blogging. Yes, the Web had tremendous impact. But blogging gave individuals voices equal to those of large, well-funded corporations. Something individuals suddenly had let them rival large corporate budgets in terms of influence: the ability to publish quickly, without bureaucracy, without friction, and without editing. 
Coupled with search engines, individuals who would previously have fired off letters, fired off posts. But this huge vocal advantage is not reserved for angry letter-writers. Mainstream media people are equally embracing this phenomenon. But more interesting than the companies who are forced into having a “corporate blogging strategy” are the individual journalists and thought-leaders who run their own rogue blogs independent of their employers. You will sometimes hear of these folks, who once spoke FOR the mainstream media AS the mainstream media. Yes, their opinions may be used on their TV broadcasts and editorial columns, but you will often hear the thoughts formulating, and in a more candid fashion, directly on their sites. MyLongTail is about leveraging these two big changes: the power of search, and the power of rapid, friction-free publishing. While MyLongTail doesn’t rely on blogging in particular, it does rely on developing the habits it takes to publish frequently, and publish well. In fact, I will be splitting it into two pieces: best practices for SEO, and best practices for publishing. I’m tempted to say best practices for “content”. But publishing, I think, gets to the heart of it. It’s about pushing out purposeful new material because of how it improves the quality of your site, and the site’s ability to pull in qualified search traffic. Labels: Mike Levin
Visualizing the Day
I’ve taken to naming my journal entries as the date, plus how I plan to use it. This one is 2006-01-28-personal.doc. I definitely don’t plan on publishing this one, because I’m planning to talk about how I get my apartment cleaned up today, PLUS work on programming. Yesterday, I started work about 10:00AM, and went to bed at 5:00AM. It was basically a 17 hour work-day. And I woke up about 7:00AM yesterday thanks to the cats, so it was almost a 20-hour day. I hope the baby-step color coding project was worth it. I think it will be, because of the effect it will have on the rest of my work. So, how do I make today effective on two fronts? First-off, lose no time on your old bad habits. No TV and no gratuitous Web surfing. So in short, no reward before the work is done. You don’t have anyone in your life who helps bring that sort of structure, so you have to bring it on your own. When I’m being lazy and neglectful, basically no one knows. I could be many times more productive than I actually am, if only I kept myself focused and working constantly—whether on mundane personal life work like keeping my apartment clean, or the interesting professional work. And since my employer has been gracious enough to let me pursue my programming passion to crank out this Web 2.0 app, I must go into hyper-effective mode to not let her down, and not let myself down. Visualize the end result, and work towards that. Since I’m working on two fronts today, I have to visualize two end results. The first is the clean apartment. That means I won’t be programming constantly. So the way to integrate the two types of work is to use the cleaning time to evoke inspiration. When the inspiration occurs, capture it right away—probably in a journal entry or in baby-step programming code. Roll something out quickly, then get back to cleaning. Plan on going to 5:00AM in the morning again tonight. That’s only 17 hours. 
On the work-front, the visualized end result for today is enough simply to monitor every move a spider makes on the new MyLongTail site—plus the documentation to show how I did it. Maybe this will become a public journal entry after all. Isn’t that the spirit of blogging, after all? Aren’t I doing this entire thing as sort of a voyeuristic form of performance art, showing how a single person can launch a Web 2.0 app. Meanwhile, it has the human interest elements of a Philly boy who recently relocated to Manhattan and wants to start taking advantage of the culture. I’m learning the PR industry, dealing in actuality on many fronts, including keeping my employer’s clients happy while I do this, and even help win new business. Some might say I’ve bitten off way more than one person can chew, and indeed, I started this all while maintaining a long-distance relationship with what I thought was the love of my life. Something had to give, and that relationship ended 2 months ago. Sighhhhh. OK, to launch into the day without distraction, quickly shower and run out to Dunkin Donuts for some coffee and nourishment. Carry your Sony voice recorder, so you can capture inspiration while not being tempted to sit down, settle in, and read news for an hour. I can even do that on my phone with RSS feeds, so I have to be particularly careful. You would think being informed up-to-the-moment in your field and world events would improve productivity. It doesn’t. It just fills your head with junk and distracts from the vision. I want to be one of the individuals helping to shape our world—not become a news junkie. And shaping our world takes the extra edge that putting the big time-sinks aside helps to provide. OK, go!
Babystep documentation system almost ready
Wow, I'm up to the step where I show what the diff output looks like. OK, that presupposes that I turned the current and previous babystep code into text files. So, it's time to fire up the FileSystemObject and the TextStream object again. I made heavy use of them in the first half of the project, but mostly for reading. This time, I'll be opening files for writing, and then they will be immediately read back in. Once they are read in, there will be an object in memory that represents the new content for between the babystep tags in the current post. And as we've done recently, we will use the RegEx object to replace the babystep tag pattern match with the new content. And the resulting content gets updated back into the table, and voila! The application will be done. Right now, both halves of the project have lots of output. It's all really just debugging output. When I combine the two halves of the project, the output will actually be made invisible. Instead, it will reload the same discussion forum thread you are currently looking at, but will force a refresh. The process will not be automatic at first, so that I can retroactively apply it to discussions that already exist. Think clicking a "babystep" link over and over until the program is fine-tuned. It will be safe to re-run, so if there are still adjustments to be made, no real damage is done. I think I'll make a backup of the table beforehand just to be on the safe side. And this same system can be used to add a program code colorizer and beautifier if that ever becomes a priority. It will be absolutely fascinating to watch how this affects my work. It is central to the way I work, and to how I plan to maintain my focus and stay engaged. The best way to learn is to teach, and this is the best way for me to teach. It forces me to be systematic, and allows me to review process. It creates a play-by-play archive of a project, recording my thoughts at different stages. 
It will help other people when they work on similar projects, and will help me by allowing review and feedback by my peers. I'm sure professional programmers will cringe at most of my VBScripting work, and liberal use of global variables to avoid parameter passing. But these initial projects are not about purity. Neither are they about long-term manageability. They are about breathing life into a site quickly and starting to build an excitement level that will justify me switching to my ideal platform, at which time I will be going through some very interesting learning processes, and documenting it all here in the babystep style. Process is an important characteristic of a project that rarely gets proper play. Programmers don't like to reveal their follies, and the book publishing model taught us to be efficient with our examples. Rarely would you re-print an entire program example to show how just a few lines of code changed from one page to the next. But that's exactly where the value lies. I can't count the number of times I looked for code examples on the Web, and had difficulty viewing the code out of context. Seeing it built up from scratch, especially when you go in steps of just a few lines at a time, can make programmers out of even the slowest learners. There is a reason for every line you put into a program, and those reasons get lost because the process flow gets lost. After a while, it just becomes the finished product and you lose the sense for how you got there. Wow, this post about the thought process behind the babystep tutorial system was going to go in the internal system, but it provides such insight into the MyLongTail site, and the type of content that's going to be found here, that I think I'll add it to the MyLongTail blog. I am also thinking about actually putting out the tutorial of the birth of the babystep tutorial system. I like the way that it is so self-referential. 
I will use the babystep documentation system to show the evolution of the babystep documentation system. It's all very circular. Some of the best systems are circular, and self-referential in this way.
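For readers who want to picture the last step described above (swapping the content between the babystep tags for freshly generated content), here is a minimal sketch. It is written in Python rather than the VBScript RegEx object the project actually uses, and the `[babystep]...[/babystep]` tag syntax and the function name `replace_babystep` are assumptions made up for illustration; the post never shows the real tag format.

```python
import re

# Hypothetical tag syntax: the post only says "babystep tags",
# so [babystep]...[/babystep] is an assumption for illustration.
BABYSTEP_PATTERN = re.compile(r"\[babystep\].*?\[/babystep\]", re.DOTALL)


def replace_babystep(post_body, new_content):
    """Swap whatever sits between the babystep tags for new_content."""
    replacement = "[babystep]" + new_content + "[/babystep]"
    # A function replacement keeps backslashes in new_content literal,
    # which matters when the content is program code.
    return BABYSTEP_PATTERN.sub(lambda match: replacement, post_body, count=1)
```

Keeping the tags in place after the replacement is a design choice in this sketch, not something the post confirms: it is what makes the operation safe to re-run, since a second pass with the same content finds the same tags and produces the same result.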
Finding Your Longtail of Search
So, a little more on the MyLongTail concept. I don’t want to give away too much before the application is real, and I’ve acquired a large enough installed base to have the early mover advantage. So, the real secret sauce will sort of remain a mystery until it is in full swing. This is a very long post, and it sort of draws the background to why MyLongTail will be so popular, and is, as Geoffrey Moore might say, about to get carried into the tornado. MyLongTail is based on the fact that search engine optimization, as most of us know, is too complicated and mysterious to ever become mainstream. Yet it must, because of how much of a disproportionate advantage it gives to those who get it right. In advertising, you might spend millions on a Super Bowl commercial. In PR, you might get mentioned in the NYT or WSJ, but in SEO, you get that top result on your keyword day-in and day-out, every time anyone in the world searches on that term. And that is too important to ignore. Pervasiveness within the natural search results makes or breaks businesses. When the rules change and positions are lost, you can often hear cries of foul play. The wounded can launch into conspiracy theory regarding forcing AdWord participation. John Battelle picked one of the many examples of such people for his book, The Search. Despite initial resistance to paid search, in the form of GoTo.com, it ultimately succeeded because of a very clear value proposition that the media buyers who control marketing budgets could understand. I pay x-amount. I get y-listings. It’s just like advertising. Not so with natural search! The rules of natural search optimization are always in flux, and there’s something of an arms race between spammers and the engines. Engines will never fully disclose how to position well, or else spammers will be able to shut out all the genuinely worthy sites. 
So, the trick for the engines is to always reward genuinely worthy sites, and the most important objective for any SEO is therefore to make their sites genuinely worthy. This concept of genuine worthiness is likely to stay around for a long time, because of how readily trust for a search provider can be broken, and how easy it is to switch. Think how little actual investment or commitment you’ve actually made to a search site. It’s not like you paid anything, or have any financial investments. As a result, search providers are uniquely vulnerable to the next big thing, which can come along at any time, prompting legions of users to flock away to the latest golden-boy darling site. It happened with AltaVista and Lycos, and could easily happen today, even with the 800-lb. gorillas-of-search. Yes, I firmly believe that the concepts of trust and the rewarding of genuinely worthy sites independent of advertising are here to stay. So, any company looking for that extra edge is obliged to look at their options in natural search. Enter MyLongTail. So, who determines whether a site is worthy? What actions can you take to ensure that your site is worthy by today’s criteria and the unknowable criteria of tomorrow? Craig Silverstein, one of the Google engineers who makes the rounds to the search engine conferences, once stated that Google’s main objective in search results is not in fact relevancy. It’s making the user happy. Happiness is the main goal of Google. And a lot of effort is going in this direction, integrating specialized searches, such as news, shopping, local directories, and the like, into the default search. There is also personalized search, which makes the results different based on your geographic location and search history. So, things are changing rapidly, and there are many factors to consider when you ask what makes a site worthy. 
When everything is mixed together and regurgitated as search results, what is the single most important criterion affecting results that is unlikely to change over time? That is where MyLongTail is going to focus. Exactly what is this most important criterion? Quality is subjective. Anything can be manipulated. Old-school criteria, when AltaVista and Inktomi were king, relied mostly on easily manipulated on-page criteria, such as meta tags and keyword density. Google’s big contribution is PageRank, which looks at the Internet’s interlinking topology as a whole. It’s a model based on the academic citation system used in publishing papers. The result was a broadening of the manipulation arena from single pages to vast networks of inter-related sites, wholly intended to change that topology to indicate things that weren’t true. Today, the engines sprinkle in many criteria, including fairly sophisticated measures of which sites were visited as a result of a search, and how much time was spent there. The engines also subtly change how the various criteria are weighted over time, which keeps all the manipulators scratching their heads, wondering what happened, and spending months responding. This way lies ruin. At what point does the effort of manipulating search results become more expensive than just buying keywords? For most companies, it’s a no-brainer. The only thing trusted less than the search engines are the snake-oil salesmen claiming to be able to manipulate those results. Why risk getting a site banned? Why invest money in something that may never pay off? I could not agree more. SEO as it is known today is too shadowy and adversarial to ever become a mainstream service, and therefore a mainstream market. So, are you going to let your competitor cruise along getting that top natural search result, while you’re relegated to pay and pay, and even engage in a competitive bidding frenzy to just hold your position? Of course not! And therein lies the rub. It’s a Catch-22. 
There’s no way out. Pay for keywords, or enter that shadowy realm. How do you get your natural hits today and have insurance for the future, no matter how things change? The answer is in the latest buzzword that’s coming your way. You’ve probably heard it already, and if you haven’t, get ready for the tsunami of hype surrounding the long tail. The term was apparently coined by a Wired writer, and has since been adopted by the pay-per-click crowd championing how there are still plenty of cheap keywords out there that can pay off big. The long tail concept, as applied to paid search, basically states that the most popular keywords (music, books, sex, etc.) are also the most expensive. They’ve got the most traffic, but also the most competition. But when you get off the beaten track of keywords, they dramatically drop off in how expensive they are, and the list of available keywords in the “long tail” of the slope-off never runs out. That’s right: as keywords get more obscure, they get cheaper, and although the overall traffic on those keywords goes down, the value of the customer may even go up! So, the long tail of search has a very clear value proposition as applied to paid search, which today is principally Google AdWords and Yahoo Search Marketing. What you do is ferret out those obscure keywords (through WordTracker, your log files and analytics, and brainstorming), run cheaper campaigns, pay for fewer clicks, and win bigger when they convert. The problem with doing this in the paid search arena is that the work that goes into identifying these keywords and migrating them over into a campaign is so complex. Traditional media buyers and the average person working in a company’s marketing department couldn’t handle it, so the work has been outsourced to search engine marketing firms (SEM), making yet another new industry. But Google automates everything! Can you imagine tedious human busywork standing in the way of increased Google profits? 
So, why not just automate the process and let everyone automatically flow new keywords into an ad campaign and automatically optimize the campaign based on conversion data? Just write an app that figures out the obscure keywords in your market space, and shuttles them over to your AdWords campaign! Then, drop and add keywords based on how well they’re converting. Before long, you have the perfectly optimized paid keyword campaign custom tailored for you. You can even do this today using the Google and Yahoo APIs and third-party products. But it is in the engines’ greatest interest to make this an easy and free process. This, I believe, is why Google bought Urchin analytics and made the service free. Watch for some big changes along these lines, and for the still-new industry of SEM to have its world rocked. And so the stage is set for MyLongTail. Paid search is being fine-tuned into a money-press, but natural search is too important to walk away from. Yet, constant change prevents products that improve natural search from becoming mainstream. Therefore, the best deal in marketing today (pay nothing and have a continual visit of qualified traffic) is unattainable for marketing departments in companies around the world. They are shut out of the game, because when researching it, they get conflicting information, encounter a shadowy world, and get constantly corralled back to the clear value proposition of paid search. This has created a potential market whose vacuum is so palpable that it’s always right at the edge of consciousness. It is a very sore pain-point that needs relief. It causes anxiety in marketing people whenever they search on their keywords and inspect the resulting screens. Yes, MyLongTail proposes to relieve that anxiety. The way it does this will be so above-the-table and distant from that shadowy world of SEO that I believe when the Google engineers inspect it, they will give a smiling nod of approval. 
For MyLongTail will be automating very little, and it will be misleading even less. It will, quite simply, put a constant flow of recommendations in your hands to release the potential that already exists. If your product or service is worthy of the attention you’re trying to win, from the market you’re trying to serve, then we will help you release the latent potential that already resides in your site. MyLongTail will help you tap into the almost inexhaustible and free inventory of relevant keywords that fills the long tail of search, so that you can get your keywords for nothing, and your hits for free. Labels: Mike Levin
Getting my day started
So, it’s 11:00AM, and I’m really only just getting started for the day. That’s fine, because I went until 1:00AM last night, and made such good progress yesterday. Also, today is Friday, meaning I can go as late and long as I want to without worrying about tomorrow. This can be a VERY productive day. I lost an hour and a half this morning trying to update my Symbian UIQ Sony Ericsson P910a phone with the latest iAnywhere Pylon client sync software. I convinced our IT guy to upgrade, so I could get the newly added features of syncing to-do items and notes, something I got very fond of with my old Samsung i700 PocketPC/Smartphone. The iAnywhere instructions say that I need to uninstall the old Pylon application at the very minimum, and better yet, do a hard reset. Only two problems: the Pylon software doesn’t show in the uninstall program, and the process for hard resetting a P910a is some sort of secret. You can find instructions on the Internet that involve removing the SIM card and doing a series of presses, but they don’t seem to work. Anyway, I did a restore to undo any damage I did this morning, and decided to get back to MyLongTail. I’m sitting in Starbucks right now. I can’t find a free and unsecured WiFi connection, so I’m working offline. I am considering one month of T-Mobile hotspot access, and I see that they offer a special deal for T-Mobile voice customers. But I don’t want to put my credit card information in on a WiFi network, so I’ll do my thought work here, and return home when I’m done with my coffee, the battery starts to drain, or I finish my thought work, whichever comes first. The marker upper program that I wrote is just awesome. I think I’ll be able to crank out babystep tutorials in a fashion and at a rate that is unrivalled on the Internet. Indeed, MyLongTail may start to become known as a programming tutorial site. 
But I’ll have to maintain a separate identity for that area of the site, because I don’t want to scare away people who are just there for the main feature—a clear and easy to implement natural search optimization strategy. It’s more than a strategy. It’s a play-by-play set of instructions.
Long day
Well, it’s just about 1:00AM, and I spent the majority of today on the marker upper project, which is just fine, because I’m thrilled with the progress I’ve made. All the logic is done, and now it’s just a matter of integrating it with my home-grown CMS system. The beautiful part is that it’s 1:00AM, and I started early this morning. So, the project is taking on momentum. It is easily as interesting as anything else I could be working on, which is key to managing distractions. As long as the main work is more interesting than any of the distractions, then the distractions have no power. It’s obvious that blogging will become a distraction, but that’s fine as long as it keeps me on course. The marker-upper project for superior baby-step tutorials will actually help me work my way through the MyLongTail project. Some pieces will become public and be posted here, but others will not. Yet, I still plan on using this tutorial method of building up the apps. My SEO team at Connors is the audience for those tutorials, but I will endeavor to make as many parts of it public as possible. Yes, I didn’t get to the other two projects that I had hoped to, but this is only the first day of what I hope will be a 4-day focus-spree. I will have to do some other work over the weekend, but for the most part, I am going to try to make MyLongTail into something that can start getting some excitement going. At the very minimum, I need to start collecting contact info of people who would like to start testing it. I’ll be looking for early adopters. I think of this much like GoTo in the early days. A lot of people didn’t get it, but GoTo laid the framework for AdSense, and what analysts are saying is now a market of over 5 billion dollars. I don’t think MyLongTail will be as gangbusters as paid search, because it is to paid search what public relations is to marketing. I’ll be posting much more on those topics, drawing the parallels between the “unpaid” promotion aspects of both PR and SEO. 
You might even call what MyLongTail intends to accomplish PR via Search. Or perhaps “search relations”. Labels: Mike Levin
Web 2.0 and Lifestyle 2.0 in NYC
I’ve lived in NYC for over a year now in this new job at the PR firm, Connors Communications, but I have hardly gotten out to see the city. It’s my own fault, but now that this MyLongTail project is becoming center stage, it threatens to swallow me up, and yet I want to get out and become a real New Yorker more than ever. So, I decided on a creative solution. I’m not much for the bar and nightlife scene, but I am a fiendish coffee drinker. And this topic for a blog post is just a silly tangent, but I want to create the blog post to add some color and commit myself to this project. I have a plan. It addresses getting rid of distractions that threaten the MyLongTail project, and forcing myself to get out and see NYC a little more. I’m one of the schmoes paying over $200/mo. for cable TV, plus premium channel, plus high-speed Internet, plus PVR. I have a laptop, but can’t reliably get onto the Internet when I’m walking around, because so many of the strong WiFi hubs are pay services. And I live on West 16th Street, not far from Avenue of the Americas, so I’m probably pretty close to a pay-service hotspot. I don’t really watch that much TV, and prefer buying the DVDs anyway, or using BitTorrent to pull down the latest recordings, which I only need the high-speed Internet connection to do. My first inclination was to replace the $200/mo. charge with $15/mo. Verizon DSL, which seems to be the big deal right now. This would be contingent on being able to get Internet DSL without phone service. From my Googling, it appears Verizon is being forced to offer that. I’ve gone wireless with phone using T-Mobile, because it gets me a decent voice plan and unlimited downloads for $60/mo. So, I’m already paying a pretty penny for phone and data. I see no reason to pay an additional $200/mo. for Internet. 
While the Verizon choice would be economical, it wouldn’t turn me into that wireless warrior that I want to be, so I can roam from Starbucks to Starbucks during the course of the day, taking in NYC. A little background on why that’s important. After over a year of a truly integrated lifestyle, living 6 blocks from work in the Chelsea section of NYC and gaining back almost 2 hours per day by getting rid of the commute, I find that I lost something of an inspired edge that I used to have. I isolated it down to the nearly hour-long car drive commute, where ideas were flying around in my head, processing at what must have been a subconscious level. When I sat down to do a project, it was almost like I had already discussed it in-depth (similar to this blogging). This applied to when I got to work, and when I got home at the end of the day (yes, I am a workaholic). But now, with my new integrated lifestyle, by going directly to the office environment with the distractions of the daily grind, then home to the distractions of TV and cats, I lost that edge. I need to get back that edge pronto. And I might as well start taking in a little more of NYC in the process. We’re in the middle of winter right now, but it’s been unseasonably mild. Such walks will be invigorating, healthy, and provide good stopping points in which I can subconsciously process ideas, while motivated by my goal of feeding my caffeine addiction, which I will be better able to afford (even at Starbucks), having given up $200/mo. TV. Unlimited national T-Mobile hotspot service with a 12-month commitment is about $30/mo. That’s way better than $200/mo. for Time Warner cable plus RoadRunner, but it’s contingent on me being able to pull in that signal from home. I may even take it on blind faith that I will be able to, and will either buy a fancy WiFi antenna, or follow one of the many instructions on how to build one from chunky soup cans or Big Boy tennis ball cans. 
Hey, I’m in Manhattan, and if you can pull in a T-Mobile hotspot from home anywhere, then you can do it here. This blog post should also make some points about corporate blogging strategies and SEO. Both quantity and quality of posts count when it comes to SEO. Typically, you want to keep your posts on-topic to your site. But the occasional divergence, including humanizing the blog, spices up the average distribution of keywords that your site is targeting. While I’m not trying to attract hits of people looking for T-Mobile or Starbucks, these are both popular mainstream topics, which when mixed with all the other words mentioned on this page, help to kick start the MyLongTail formula, which you will be learning much about in short order. This post is also a good example of Web 2.0 thinking, how services can be mixed and matched to suit your customized need (be they XML Web services, or phone or Internet service). It has little to do with the service provider’s intended need. We’re free to mash it up as we like in order to pursue increasingly individualized lifestyle choices or program applications. Yes, as Paul Graham points out, Web 2.0 is a contrived buzzword to justify a new conference. But it wouldn’t have become so broadly adopted if it didn’t strike a fundamental chord. Like Reagan telling Gorbachev to tear down this wall, O’Reilly and Battelle are telling developers to tear down the walls around walled gardens of service. I actually went ahead and posted that idea for a theme on John Battelle’s search blog, but I didn’t link it back to this post, because I’m not quite ready to be found by the spiders yet. Though, even providing this link out could start the process, because John might be running the Google Toolbar with privacy turned off, and look at his referrers. He also might have his log files or analytics reports findable by Google, leading a path here. Anyway, that just pushes me on with all the more urgency to my projects. Labels: Mike Levin
The 80/20 Rule and Nested Sub-projects
OK, I’m starting on these 2 projects, but I’ve got the documentation bug. These are two very mainstream projects that would be of great use to the world at large. And I’m going to build them up from scratch, using nothing more than what’s already installed on most people’s desktop PCs. So, I want to document it with my baby-step technique, where I show the entire program as it develops on every page of the tutorial. It’s quite inefficient from a file standpoint. But if the CMS system makes it manageable, there’s really no harm. It is the Web after all, and you don’t have to kill trees for more pages. And if it makes the student’s experience better, it’s worth it. But that leads to a nested sub-project, and the issue of whether or not to do it. I am a big believer in the 80/20 rule that states you should plan to get 80% of the benefit of any endeavor from the first 20% of the work (there are other interpretations for tax purposes, etc.). So, a series of half-implemented projects has a net gain and still moves you forward. Nested sub-projects, which plague many professions, are the enemy of productivity and the 80/20 rule. Suddenly, you’re embarking on project after project before you even start on the first thing. I even wrote an 80/20 rule poem. You saw an example of me avoiding a nested sub-project pitfall when I decided to just go ahead with Blogger. I could have tried installing WordPress, had another server and database to deal with, a new system to learn, etc. I could have even written my own (which I have partially working). But by choosing Blogger, I could just forge ahead. And here I am the next day with many blog posts under my belt and standing at the edge of another potential pitfall, looking over the edge. Let me explain. I can already do baby-step tutorials using my CMS. The problem is that to make them really cool, you have to highlight what lines changed in the code from page to page. Manually inserting the markup is tedious, and defeats the purpose. 
It can and should be automated, and I already used a Windows derivative of the Unix diff program to get started. It’s half-way implemented. I basically make a post using my home-grown blogging system, and it looks at the previous post, finds the differences, and automatically inserts highlighting and strike-out code to show on the current post what changed from the previous post. It’s way-cool, and the foundation for something new and innovative on the Internet in its own right. I could easily imagine this site becoming as popular for the novel baby-step tutorials as for the MyLongTail app. Problem is, it’s only half-implemented, and I don’t know whether I should try to bang it out all the way before today’s main projects. Let’s evaluate. Ask the key questions…
- Can you possibly imagine even more nested sub-projects? Or is it a clear one-off? Will you get caught in a recursive trap and a nightmare of maintenance overhead?
- Is it foundational, meaning that it will improve all the other work you do and start resulting in compounding returns? So, over time, do you really save time?
- Is there a better already existing way to do this?
- What are the corollary benefits, and do they outweigh the lost time?
- Are there urgent aspects of the projects you’re putting aside? What is the damage of delay?
- Is it really necessary for your primary objectives?
This should be a clear one-off project, because it is basically a rebound-action on a database insert. When the insert occurs, just do this quick processing. The auto-markup occurs. It is foundational, but only on the documentation side, if you consider documentation foundational. It’s way different than other documentation systems in that it captures process, for better or for worse. It is much like the Disney work-in-progress pencil-sketch test animations that have more character than the finished product. It also gives you more ability to learn about the animation process than the finished product. It is rare to the point of non-existence in the programming world, because it takes too much time to document in this way, and reveals the many imperfections of the creative process (because it documents mistakes and all). The closest thing to this is Wiki revision tracking, and code version management software. Anyway, all this is to say, yes, I believe the work is foundational, because it is key to providing a rich documentation and tutorial experience on this site. This gets to the fact that I already made the decision to use my home-grown CMS system. With that decision made, I need to choose something that integrates well. And I actually already am choosing “the better way” to do this in tying in the Unix-derivative diff program. I could have attempted to actually program this from scratch, but this program gives me everything I need to parse a file and insert the code. I can focus on parsing and marking up instead of detecting differences. There are massive corollary benefits. It allows me clearer differentiation of what I document in Blogger, and what goes in the CMS (stream-of-consciousness goes in Blogger, and baby-step code tutorials go in CMS). The tutorials increase the coolness factor and buzzworthiness, and with them the chances of getting this site written about, and eventually SlashDotted. That is not only a corollary benefit, but is a main objective. 
The project also clarifies my own thinking, making me code more carefully knowing that the non-proprietary parts are going to be public and under scrutiny by other programmers. Yes, there is a very urgent aspect of the projects I’m putting aside. I want to document the very first search hits to ever occur on MyLongTail, and the very first GoogleBot visit. I may miss them. The site is already out there, and I’ve been blogging (but without pinging). Will the site be that much less interesting if I miss these key events? If I can really isolate the projects (all 3) down to a single day, will I be really jeopardizing it that much more? I don’t think so. So, all three projects should be done in one day. But one of those projects is really less important than the others. More on that soon. While not necessary to my primary objectives, it certainly does help the “erupt in buzz” objective. More tutorials mean more pages of real value to a broader audience, and more search optimized pages, and more pages using a unique and valuable tutorial technique that perhaps the buzz-brokers will recognize. It’s also worth noting that I already have the blogging bug, which is somewhat cutting into just getting the work done (it’s 11:22AM already). And having this system running will let me feed that blogging appetite while simultaneously actually doing the coding. So, while not necessary for my primary objectives, it highly reinforces them, and I will move ahead with the baby-step tutorial marker upper.
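To give a feel for what the marker upper does, here is a rough sketch of the idea: compare the previous step's code with the current step's code, then wrap added lines in highlighting and removed lines in strike-out. This is not the actual implementation. The real system shells out to a Windows port of the Unix diff program from VBScript, whereas this sketch swaps in Python's standard `difflib` module to detect the differences. The function name `mark_up_changes` and the choice of `<span class="added">` and `<strike>` as markup are assumptions for illustration only.

```python
import difflib
import html


def mark_up_changes(previous_code, current_code):
    """Return current_code as HTML lines, with changes since previous_code marked."""
    previous_lines = previous_code.splitlines()
    current_lines = current_code.splitlines()
    matcher = difflib.SequenceMatcher(a=previous_lines, b=current_lines,
                                      autojunk=False)
    marked = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op in ("delete", "replace"):
            # Lines that existed in the previous step but are gone now.
            for text in previous_lines[a1:a2]:
                marked.append("<strike>" + html.escape(text) + "</strike>")
        if op in ("insert", "replace"):
            # Lines that are new in the current step (hypothetical CSS class).
            for text in current_lines[b1:b2]:
                marked.append('<span class="added">' + html.escape(text) + "</span>")
        if op == "equal":
            # Unchanged lines pass through with only HTML escaping.
            for text in current_lines[b1:b2]:
                marked.append(html.escape(text))
    return "\n".join(marked)
```

Calling `mark_up_changes(step_one, step_two)` on two consecutive versions of a program yields the full current listing with only the changed lines decorated, which is the whole point of the baby-step style: the reader always sees the entire program, and the eye goes straight to what moved.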
VBScript in a Web 2.0 World
Well, what about those 2 projects? To help this site erupt in buzz, I’m going to also make it into a tutorial site, focused on a unique brand of tutorials that I can’t get enough of: baby-step tutorials. Now for a bit of philosophy. I program in chisel-strike projects—projects I can conceive of and implement in a single day, while it stays consistent with the overall vision of the project. I was inspired by the way Michelangelo once described his work as revealing the sculpture that was already hidden in the stone. Every chisel-strike a master sculpture takes reveals more of the masterpiece contained within. It reaches a certain point where it’s clear what the sculpture is, and it becomes a joy to look at, and could already be put on display. That’s what all these “beta” sites that are in beta for years are about (in addition to reducing tech support liability). There’s no reason to wait for the pristine and polished finished product before you start getting the benefit. There’s lots of ways to describe this. I use a chisel-strike metaphor. In programming, there used to be a lot of talk of spiral methodologies replacing waterfalls. Recently, talk of agile methodology has come into vogue. Some would call it hacking. But whereas hacking in yesteryear resulted in a working app at the expense of long-term manageability, hacking today can very easily result in the same working app, but on top of a robust framework that “un-hacks” it. Ruby on Rails is an example of such a framework. But many of the chisel strike projects I start out with are going to be VBScript. That’s right. I’m building this thing is ASP Classic on IIS. I’m doing it knowing it will be on the Ruby programming language for the back-end scripts when I have the time, and the Ruby on Rails agile framework for the front-end user interface and applications. What? ROR is supposed to be so ridiculously simple that you can sit down, install it, and have written your first app in under an hour. 
It’s true, and I’ve done it. But several factors affected my decision to move forward with VBScript. First and foremost, I too am doing an extraction of an existing system (the way ROR was extracted from Basecamp). I don’t like my extraction as much, and I’ll never open source it. But it exists, and it’s my fastest path to implementation. Second, once I make the move to ROR, I think it will be time to break all my Microsoft dependencies, and get off of SQL Server. I love SQL Server, and think it’s tweaked-out in terms of transactions per second, self-optimization, and disaster prevention in a way that MySQL is not (yet). It is increasingly an acknowledged competitor to Oracle and DB2. But scalability has a lot to do with cranking out multiple software instances of your system at little to no additional cost. That means being in the open source world. It will also lower operational cost and maximize profits. Yes, there are MS-arguments against this, but they don’t hold up over time as there are more and better means of supporting open source installations. And I don’t know Linux/Apache yet. So, no matter how simple ROR may be, I will be taking it as an all-or-nothing package. I don’t want to create a hybrid that keeps a Microsoft platform but installs MySQL and Ruby on it. Even though it would be a great learning experience, it would slow my initial speed of deployment. The benefit for you as an audience is to see someone still doing viable Web 2.0 work on VBScript/ASP Classic, with a plan to move to Ruby on Rails, and then whatever tutorials I create during the transition. If my plan goes well, it should be a series of baby-step tutorials that will help anyone make the move.
Blogging, Continuity and Productivity
One way a blog like this helps when designing a new Web 2.0 site is continuity of discussion. I’m working on this project primarily as a one-person show. I have some great backup in the programmers we have working for us at Connors back at the office, and I have my long-time partner in crime who helped with previous incarnations of the system. But this blog constitutes a real-time, ongoing discussion with myself, and lets me pick up where I left off smoothly. There was an article I read a few months ago about productivity in programmers. I forget exactly where, but I think it was when I was researching agile methodologies, and the author made the point that a single programmer with a clear vision of what he/she is trying to do can be something like 1000% more effective than a programmer working on a team. That is, one motivated programmer using agile development methodologies can do 10x more work than a counterpart working as part of a team where project management software, bureaucracy and meetings constantly corrode the ratio of work accomplished to hours spent. It wasn’t Paul Graham who wrote this, but somehow I associate the concept with him, based on how it jibed with the many articles I’ve read on his site. If I find the actual reference, I’ll post the link. The purpose of the blog posts in the morning is like winding the catapult. I should have clarity on the rest of the day. Yesterday, I made the MyLongTail site live. I essentially made the decision to develop this live online in stealth mode. This has the SEO advantage of letting the clock start ticking as soon as possible to let the domain age as far as the engines are concerned. The latest Google wisdom following the Jagger update is that a domain should be about a year old to overcome a negative weighting penalty. Most spam sites are newly registered domains. 
There’s some uncertainty about when the clock starts ticking: whether it’s when the domain is registered or when Google discovers it for the first time. GoogleBot is unlikely to discover the site until at least one inbound link is established to it. But several PCs I use have the Google Toolbar with privacy turned off, so Google will know about the existence of these pages very soon (if not already). But I want this site to chronicle a complete and accurate history of the birth of a Web 2.0 site from an SEO point of view. So, today’s priority is to put the systems in place to track spider visits. These spider monitoring systems also start a more advanced process of what this site is all about: collecting data that becomes intelligence that becomes action. MyLongTail is not going to advocate spider-watching, because that is a misappropriation of valuable time from the average marketing department’s point of view. I’m doing it because it’s of interest for this particular site. When was the first visit by GoogleBot? Which pages has it picked up? How much time went by before the first Google search hit occurred? Yes, this might be of casual interest to marketing departments that have too much time on their hands. But MyLongTail focuses on “what hits occurred recently” and “how can we use that to make more hits occur soon?” Many of the peripheral and pedantic details will be thrown in the trash to make the overall system more focused and efficient. Labels: Mike Levin
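To make the spider-tracking questions concrete, here is a minimal sketch of the idea in Python, not the actual system running on this site. It assumes Apache-style "combined" log lines (IIS writes a different layout, so the pattern would need adjusting), and the function name `spider_visits` and the sample log entries are made up for illustration. It answers two of the questions above: when was the first GoogleBot visit, and which pages has it picked up?

```python
import re
from datetime import datetime

# Assumption: Apache-style "combined" log lines; IIS logs use a different layout.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def spider_visits(lines, spider_token="Googlebot"):
    """Return (first_visit, pages) for requests whose user agent names the spider."""
    first_visit = None
    pages = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or spider_token.lower() not in match.group("agent").lower():
            continue
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        if first_visit is None or when < first_visit:
            first_visit = when
        if match.group("path") not in pages:
            pages.append(match.group("path"))
    return first_visit, pages


# Made-up sample entries, for illustration only (not real traffic).
SAMPLE_LOG = [
    '192.0.2.10 - - [24/Jan/2006:09:15:02 -0500] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.66 - - [25/Jan/2006:03:41:10 -0500] "GET /blog/ HTTP/1.1" 200 8042 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.66 - - [25/Jan/2006:03:41:12 -0500] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

first_visit, pages = spider_visits(SAMPLE_LOG)
print("First spider visit:", first_visit)
print("Pages picked up:", pages)
```

Matching on the user-agent string alone is the simplest possible test; anyone can fake that string, so a real tracker would probably want to verify the visitor some other way as well.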
Double-whammy Logo Design
The last thing that I want to do before I go home today (where my kittens, who are not used to me being away so late, will kill me) is a unified header to glue together the CMS and the Blogger pages. A single graphical header going across all the pages will go a long way towards unifying the two systems (blog & CMS) and catalyzing my vision as to what the site is to become. Happily, I have a logo all ready. Rarely do I embark on graphics projects anymore, even though that is my training. I’m tired of the subjectivity of graphic design. Everyone is an expert, everything is subjective, and fashion rules. Nonetheless, I dusted off my sketching skills and doodled out a design that I hope my old instructor, the master of ambigrams, John Langdon, who did the work in Dan Brown’s Angels & Demons, would be proud of. My logo is not an ambigram, but it uses the principles I learned in John’s typography class, of how the strongest logos often zero in on a letter that says something about the overall word, and exaggerate it just enough to turn it into a sort of onomatopoeia, a word that represents the meaning. Words like Bam and Sniff are onomatopoeias. It’s so much stronger than just adding the latest swoosh that is so prevalent in logos today. Ambigrams are double-whammy design, because they work for more than one reason, but it can be done without making it readable upside down. There was once a magazine named Family, which accomplished such design by dotting the “l” and a few other characters. Very effective. When I can, I try to make a logo work for 3 or 4 reasons. I think I nailed that here. First, you’ve got humor: what is “my long tail?” If that’s not an ice breaker, I don’t know what is. Second, you’ve got echoes of the ubiquitous logo that has been burned into all of our retinas: Google. I tried to make the placement of the “g” reminiscent of Google (although, it’s a wholly different typeface). 
Thirdly, I exaggerated the g so that its tail literally becomes the tail. I could go on, but 3 reasons it’s such a strong logo are enough. I’ve got it up on the site. OK, so I’ve uploaded the logo and put it at the top of both the CMS template and the Blogger template. I also took the step of unifying the styles from the two systems. That way, I won’t end up maintaining two sets of CSS, and I can just edit a single external linked file to tweak the overall look of the site without re-generating the static pages. It will also help enforce a unified look between CMS and blog.
Building Search-friendly From the Start
OK, now it’s time to apply a graphical header across both Blogger and the main site. And it’s time to make some commitments to a CMS system for the main site. There are many CMS systems out there, and the last thing I’m going to do is go through the learning curve on even an easy one. Is a website a Web application or a bunch of HTML files? For manageability, it has to be thought of as a Web app, but for search optimization, it needs to be thought of as HTML files. Blogging software long ago solved this by “outputting” static HTML files from its database. This has a plethora of advantages, including reducing server load (serving static HTML files is much easier than executing code). Even if your dynamic pages are masquerading as static HTML, you’ve got increased server load, now two-fold: first, from the invisible reformatting of the URL that takes place with the mod_rewrite technique, and second, from executing code that probably queries a database, populates variables, then finally serves up the page. Static pages, while providing less customizability, are much better for high volume sites. I believe I’ll be using our own home-grown CMS system for the rest of the MyLongTail site. The back-end controls don’t have the features or the polish of other CMS systems, but I know it inside and out. It gives 100% uncompromising artistic control (unlike most CMS), and it creates pages that are perfectly optimized static HTML for search engines. And best of all, when things change on the Internet, I can just re-work the XSL transformations, and appease the search engine algorithms du jour, at least as far as internal link structure is concerned. Our home-grown CMS system was designed specifically with SEO in mind, and more particularly, with non-commitment to website architecture or technology decisions. Very advanced XSL queries “knit” the website together, very much the way blogging software can rebuild the static pages of a blog. 
But because we control that transformation, we can re-work it whenever we need to. Anyway, I need to go through the steps I would take for any website using our CMS for SEO system. I will need at least one page on the site. From a scalability standpoint, my home-grown system is great when the entire site is going to be HTML. But much of this site is going to be an application. So, while I’m starting it this way for expediency’s sake, I very well may switch over to Ruby on Rails for an SEO-friendly app site. Additionally, much of the application will be written with AJAX, which is inherently SEO-unfriendly. So as the site becomes more application-like and cooler and cooler, it will simultaneously be becoming less friendly to search engines. That’s part of the reason why the blog is so important (Blogger is inherently SEO-friendly). Blogging lets us roll out content in a friction-free environment. Anyone who has managed corporate websites knows what I mean when I say friction. Because I’m blogging from Microsoft Word, I can roll out content with almost no friction. But the content that becomes the navigational framework of the site will be from my home-grown CMS, which is also inherently search engine friendly. Together, the blog and the navigation pages will create a very competent placeholder, so it can start setting properly into the engines. Labels: Mike Levin
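The "manage it like an app, publish it as static HTML" idea is easier to see in a few lines of code. What follows is a rough sketch in Python, not our CMS: the real system uses XSL transformations, and here a plain string template stands in for that step. The names `slugify`, `publish`, `PAGE_TEMPLATE` and the sample records are all hypothetical. Each record becomes one static file, and the same text drives the title tag, the headline and the file name.

```python
import re
import tempfile
from html import escape
from pathlib import Path
from string import Template

# A stand-in for the transformation step; the real CMS described above uses XSL.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n$body\n</body></html>\n"
)


def slugify(title):
    """Turn a headline into a lowercase, hyphenated file name stem."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def publish(pages, out_dir):
    """Write one static HTML file per record and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for page in pages:
        html = PAGE_TEMPLATE.substitute(title=escape(page["title"]), body=page["body"])
        path = out_dir / (slugify(page["title"]) + ".html")
        path.write_text(html, encoding="utf-8")
        written.append(path)
    return written


# Hypothetical records standing in for rows pulled from the CMS database.
PAGES = [
    {"title": "Building Search-friendly From the Start", "body": "<p>Placeholder text.</p>"},
    {"title": "Adjusting the Blogger Template", "body": "<p>Placeholder text.</p>"},
]

with tempfile.TemporaryDirectory() as tmp:
    for path in publish(PAGES, tmp):
        print(path.name)
```

Because the template is the only place the page structure lives, changing the look or the internal link structure means editing the template and re-running the publish step, which is the same benefit I get from re-working the XSL.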
Adjusting the Blogger Template
OK, I don’t want to get bogged down in blogging details. And I actually looked closely at moving to WordPress or even Ruby on Rails Typo, in order to sharpen my ROR skills. But even such a small step is not worth it at this point, because I would have to start worrying about different servers and databases. Blogger is very competent, and it has the Microsoft Word plug-in. But one concession I am making to tweaking my blog environment is I’m stripping out all the CSS styles to see what it looks like bare bones. I’ve kept the special Blogger code, but that too I’m going to take a close look at. I haven’t done the Blogger template customization chore myself directly in any significant way. I have done a few light touches to help my people at Connors to understand the SEO issues. I would have liked to have added the previous/next arrows featured in MovableType/TypePad and WordPress. That’s one of the keys to efficient SEO. Blogger offers the 10 most recent posts, which approximates the same effect. But when you look at the Google PageRank algorithm, there are definite differences in how the PR juice gets distributed internally within the site. The prev/next arrows prevent topic dilution for that set of links, but the 10-recent links accomplish much the same effect in net. Blogger has an outage scheduled for 4:00 today (and it’s 3:55), an unforeseen downside. But not a big deal, because I can save the HTML locally (which it technically already is, thanks to the FTP feature), and add the styles back in one at a time to understand what they’re doing. OK, I’m not a big CSS guy. Over the years, I’ve tended to use table structure to enforce page layout. The common wisdom has gone against this in recent years. Bare bones HTML can be marked up with div tags, which then can be converted into columns with some very light touch CSS. 
There are an awful lot of CSS instructions between the style tags of a default template, and in putting them back in one at a time to see what they do, I see that the heavy lifting is done in one little spot…

@media all {
  #content { width:660px; margin:0 auto; padding:0; text-align:left; }
  #main { width:410px; float:left; }
  #sidebar { width:100%; float:none; }
}

And the blog magically acquires the 3-column look. Sheesh, it’s that easy! No wonder CSS is becoming so universally embraced. It’s hard to imagine going back to table code to accomplish the same thing. The next thing I’m going to do is alter the leftover blogging code (after all the CSS was removed) to make a few of the basic SEO optimizations required to fix Blogger’s default templates. First and foremost is the permalink anchor text. Most popular Blogger templates ridiculously put in the time of day that the post was made. Keep in mind, anchor text is enormously influential in search results. So, it should be nothing other than the same text that becomes the title tag, headline and file name of the permalink page. 
So, the line that reads…

<p class="post-footer">
<em>posted by <$BlogItemAuthorNickname$> at <a href="<$BlogItemPermalinkUrl$>" title="permanent link"><$BlogItemDateTime$></a></em>
<MainOrArchivePage><BlogItemCommentsEnabled>
<a class="comment-link" href="<$BlogItemCommentCreate$>" <$BlogItemCommentFormOnclick$>><$BlogItemCommentCount$> comments</a>
</BlogItemCommentsEnabled><BlogItemBacklinksEnabled>
<a class="comment-link" href="<$BlogItemPermalinkUrl$>#links">links to this post</a>
</BlogItemBacklinksEnabled>
</MainOrArchivePage>
<$BlogItemControl$>
</p>

…should be changed to…

<p class="post-footer">
<em>posted by <$BlogItemAuthorNickname$> @ <$BlogItemDateTime$></em>
<BlogItemCommentsEnabled>
<a class="comment-link" href="<$BlogItemCommentCreate$>" <$BlogItemCommentFormOnclick$>><$BlogItemCommentCount$> comments</a>
</BlogItemCommentsEnabled>
<$BlogItemControl$>
<br/>Link to: <a href="<$BlogItemPermalinkUrl$>" title="permanent link"><$BlogItemTitle$> permalink.</a>
</p>

And the final step that should be done in fixing the default Blogger template (not customizing) is to add a line to the Previous Posts section linking back to the top of the blog. Blogger has this odd habit of letting you navigate deeper into the past by following the “previous 10 posts” links, but not forward in time. But now, I have a truly bare bones Blogger template. I’ve stripped it down to the essence, removing everything you might consider a Blogger “signature”. Not that I want to obliterate the fact that I’m using Blogger. But rather, I want to build it back up into the look that the MyLongTail site acquires so that the blogging section is indistinguishable from the rest of the site. The decisions I make regarding header graphics, column widths, etc. will be made now simultaneously to the Blogger template and whatever system I end up using for the main site. Labels: Mike Levin
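A quick way to confirm the permalink fix took hold is to scan a published page for permalinks whose anchor text is still just a time of day. This is only a sketch in Python under a few assumptions: it looks for links carrying title="permanent link" (as in the template lines above), it treats text like "4:05 PM" as a timestamp, and the class name `PermalinkAudit`, the function `timestamp_permalinks` and the sample HTML are all hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumption: a bare timestamp looks like "4:05 PM" or "16:05".
TIME_ONLY = re.compile(r"^\s*\d{1,2}:\d{2}\s*(AM|PM)?\s*$", re.IGNORECASE)


class PermalinkAudit(HTMLParser):
    """Collect (href, anchor text) for links marked title="permanent link"."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("title") == "permanent link":
            self._href = attrs.get("href", "")
            self._text = []

    def handle_data(self, data):
        # Only gather text while we are inside a permalink anchor.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def timestamp_permalinks(html):
    """Return the hrefs of permalinks whose anchor text is only a time of day."""
    audit = PermalinkAudit()
    audit.feed(html)
    audit.close()
    return [href for href, text in audit.links if TIME_ONLY.match(text)]


# Made-up published output: one old-style footer, one fixed footer.
SAMPLE = (
    '<p class="post-footer"><em>posted by Mike at '
    '<a href="/blog/first-post.html" title="permanent link">4:05 PM</a></em></p>'
    '<p class="post-footer">Link to: '
    '<a href="/blog/second-post.html" title="permanent link">Second Post permalink.</a></p>'
)

print(timestamp_permalinks(SAMPLE))
```

Any page the function flags is still handing its anchor text to a timestamp instead of the post title.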
Welcome to the MyLongTail Blog
OK, it’s time to jump in head-first with MyLongTail. There’s been way too much thinking and office disruptions. It’s a downward spiral. You think that all you can do is the thought work, because the next distraction is imminent. The next distraction sucks you into office day-to-day work evermore. So, you do less actual work, in favor of thought work. Almost 3 weeks have passed, and a good amount of this project should be doable in just a couple of weeks. To my credit is the fact that we worked out some important issues of reducing load, based on several of Connors’ heavy traffic SEO clients. See, this system is already in place in its previous form, and turning it into a Web 2.0 offering is basically just an “extraction”, and some viral marketing. OK, perhaps it’s time to frame this as the birth of a Web 2.0 company, and conduct it like performance art. Can you do it without giving away the farm? How much openness and candor is healthy, and how much makes it too easy for the next person to reproduce it? Wouldn’t it be something to turn this very journal entry into the first blog entry of a public-facing MyLongTail blog? Yes, that would be something. What steps would I take? The first would be getting the blog going, and making this the first post, ASAP. For marketing purposes, make the blogging portion of the site part of the mainstream blogosphere. Don’t get bogged down by creating your own blogging software or choosing hosted blogging software for fancy features. Get blogging fast, and get the full search optimization benefit. That means Blogger. Why? For the benefit of the people reading this blog post, hosted solutions such as TypePad require that you dedicate a third-level domain to the cause, instead of a subdirectory of an existing www site. 
And with the local-install solutions, you have to go through the install and customization, and even then you often want it on a different server than the website you’re developing, getting right back to the subdirectory problem. Why is a subdirectory desirable for a blog? For search optimization, but we’ll get to that later. But it does show you the important point, that you will be seeing all the decisions that go through the head of an experienced search engine optimizer, as he creates a site from scratch. I’ve already got the MyLongTail domains, and have them preliminarily hosted on the corporate production servers. I’m going to construct a rudimentary template that will translate well to both a Blogger template, and a template in my systems. The idea is to keep it simple for now. We’re constructing a teaser to get the attention of a specific audience, and to start signing up early adopters, and to engage trend setters in conversation. I’ve done several projects like this in the past, and have always maintained a Web journal of this sort. The difference being, I have either kept the journal private, or my intended audience could not care less what I was doing, until it was over and they saw the impact. This will be different, because the public at large will be my audience, and I’ll interweave this with the Ruby on Rails, AJAX, Web 2.0, and Longtail movements. It’s almost a guaranteed success. Anyway, back to Blogger. Yes, Blogger. There are fewer cool plug-ins for Blogger, because it is hosted. And it doesn’t seem to be a Google priority, so features aren’t moving forward as fast as WordPress. But it is search-optimized, and easy and free. It has a Microsoft Word plug-in, which makes publishing ridiculously easy. And if things change radically, you can export its contents as XML, and transform the structure with XSL to bring it into any other system. So basically, there are no downsides to Blogger, and getting going is a 1-hour proposition (at most). 
So, Blogger is SEO-friendly, and it can be planted in the subdirectory of an existing site. I will use www.mylongtail.com/blog. I go into my Blogger account and create a new blog. I choose the minimal template. I publish a test post. I view it on mylongtail.blogspot.com, and it looks fine. I update the FTP settings and test, and the blog is in location using a default Blogger template as planned. The one downside I now recall about Blogger is the fact that it inserts the navigation bar at the top, obviously for viral marketing. My personal blog site, Mike-Levin.com, was apparently grandfathered in from when that nav bar could be turned off. I searched the Blogger controls for the way to turn it off, and some Googling shows that people are doing it with hacks these days. That’s well and good, but that b-navbar div actually inserts JavaScript. Ugh! I’m tempted to just write some quick blogging software myself, but I don’t want to chase that particular rabbit. Ready, fire, aim! Labels: Mike Levin