Site authored by: Sean Bordner
MOSS SEO explains how to optimize your public facing SharePoint 2007 WCM based web site for internet search engines. SharePoint 2007 running in “CMS” mode might just be the perfect opportunity to fully flex true Search Engine Optimization. All the necessary components are in place to help you finally achieve top search engine rankings (a rare find when talking about Content Management System driven web sites)! MOSS SEO will get your MOSS site on top where it needs to be.
The essence of this site is to share MOSS SEO techniques for implementing sound SEO. First, let’s touch on some basics:
As you may have already learned the hard way, there is not much point in having the best prices or services and the all-around coolest website on Earth if nobody knows it’s there! This fundamental oversight can easily be avoided by taking the time to ensure your website is search engine friendly. I’ll talk more about that later in this article. Right now, allow me to share a college story with you. I had an English Writing class and the Professor was an older gentleman. He also served as the local newspaper Editor. He was a straightforward man with only a few rules to always follow. The most important rule was very simple: no more than two clichés per paper. He hated clichés and made a great effort to get this across to his students. I detected this ‘not so subtle’ deeply rooted hatred of his, and attempted to submit papers without a single cliché, every time. I would sometimes be shocked to find a red circle around a cliché which had inadvertently found its way past my fingers to the keyboard. The Professor would do this to flag them, making for an easier final tally of clichés and ensuring he would miss none. He would even circle ‘carefully disguised clichés’ which I had word-smithed to say the same thing using different words (this technique failed me).
I often attempt to put myself in the search engine spider’s shoes when optimizing a web page. A search engine spider’s mission is simple: it goes out there, follows every link it encounters and reports back with its findings. Spiders take the path of least resistance and are not interested in anything that slows them down. That’s being overly kind, actually; spiders hate things that slow them down or attempt to deceive them. It occurred to me they are not unlike my old Professor, but perhaps a bit less lenient. A red circle on a web page translates to ranking loss, which translates to traffic loss, which translates to… you get the point. At least my old Professor would clearly identify the “errors of my ways” (hope he’s not reading this). A search engine spider is a passive-aggressive creature. It will leave its footprints all over a website’s log files, allowing us to know it has in fact ‘graded’ the pages. Nothing will be said: no red circles to ponder and correct, no grades to reflect upon, nothing but little spider footprints in the web logs. These footprints are saying “hey, I’m a spider and I just crawled your website, thanks for having me over, I’ll come again sometime, crawl ya later…” That’s what you get to see – do not forget about the “aggressive” nature of a passive-aggressive. Meanwhile, the spider is back at search engine headquarters hanging out around the water cooler trading horror stories with all his little spider buddies. “Man, this one website had SO many nested tables my head was spinning. I never did find any relevant content, just HTML, blah, blah, blah… I was nearly lost three times, so I just left – geesh!” This ‘spider smack-talking’ gets recorded and is used in determining how your website is ranked (the position in which your website is returned in the search results).
I am going to make this as straightforward as my old Professor would have: If the search engines do not like what they see, your website will not be ranked well in search results. If your website is not ranked well in the search engine results, nobody will know about your website. Less than 10% of all Google.com users ever see past page 3 of search results. Search Engine Optimization (SEO) should be taken very seriously. After all, exactly what IS the point of a website if nobody finds it? There are some SEO basics, which unfortunately are often overlooked and/or underestimated. Search engine spiders (robots) visit websites and report their findings back to the search engine database. These spiders are looking for basic information about your website, which can be found in the code (for example: page title, page description, page keywords, etc…). Much more goes into search engine ranking algorithms (keyword density, domain registration properties, inbound links, consistency, etc…), but the basics play a significant role and cannot be overlooked by serious websites. These basics can be programmatically handled with SharePoint 2007 (MOSS), leaving more time to be spent on content and other very important SEO criteria. More on automating the basics later…
A word of caution: do not try to ‘trick’ the search engines. Attempting to fool a search engine can get your website blacklisted, fully nullifying any SEO efforts, past and future. This is a counterproductive practice which will do more harm than good. So, what do I mean when I say ‘trick’ or ‘fool’ the search engines? You know exactly what I mean! When you get ideas like placing a bunch of keywords in a really small font size, in a color matching the page’s background, so the spiders see them in the code but users don’t – that’s what I mean. Or, word-smithing your page to the point of absurdity, so the keyword density will be high for a specific word or phrase – that’s called ‘keyword stuffing’. Or, using pages that serve no true purpose other than to rank well with search engines – called ‘doorway pages’. The list goes on and on, but the one common denominator is attempting to deceive the search engines.
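To make “keyword density” a little more concrete, here is a minimal Python sketch that measures what share of a page’s words is a given keyword. The function name and both sample sentences are made up for illustration, and no search engine publishes a density threshold, so treat the numbers as a way to compare your own drafts rather than as a rule.

```python
import re
from collections import Counter


def keyword_density(text: str, keyword: str) -> float:
    """Return the fraction of words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)


# Two made-up sentences: one written for people, one stuffed with a keyword.
natural = "Our SharePoint consultants help teams plan, build and support public web sites."
stuffed = "SharePoint SharePoint experts for SharePoint sites with SharePoint SharePoint help."

print(f"natural: {keyword_density(natural, 'sharepoint'):.0%}")
print(f"stuffed: {keyword_density(stuffed, 'sharepoint'):.0%}")
```

If a draft reads like the second sentence, it was written for the spider and not for the visitor.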
What NOT to do
Many things factor into the overall ranking of a web site. Some of the more obvious ones include: site availability, amount of content, keyword density, etc… But what about the things that harm your site’s rankings which you may be doing and should NOT be? This list would include: keyword stuffing and trying to “fool” the search engines with doorway pages or hidden text. Are you using the exact same HTML title on every page of your site? Are you submitting your site to multiple search engines more than once – or worse yet, allowing some online “SEO” company to submit your site to multiple search engines on your behalf? All the aforementioned things are very common mistakes that may negatively affect your web site’s rankings. These are not “trade secrets”; in fact, the major players all have information available online which explains how to improve your web site’s rankings. Some less obvious things which could potentially harm rankings include: domain forwarding, content syndication, domain registration term, all those super-cool graphics and page layouts that require mind-twisting table nesting or, worse, complex URLs (query strings, for example: www.mossseo.com?querystring=1&anotherquerystring=2&soOn=50). You can learn a lot about what not to do by reading what the search engine folks themselves advise you to do and not do. The most important thing to remember is do NOT try to fool the search engines.
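The “same HTML title on every page” mistake is easy to check for yourself. The sketch below is one possible approach in plain Python, using only the standard library; the `pages` dictionary (URL mapped to HTML source) is a hypothetical input you would fill from your own crawl or export.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def page_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def duplicate_titles(pages: dict) -> dict:
    """Map each title used by more than one page to the URLs that share it."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[page_title(html)].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical pages, for illustration only.
pages = {
    "/": "<html><head><title>Home</title></head><body></body></html>",
    "/about": "<html><head><title>Home</title></head><body></body></html>",
    "/contact": "<html><head><title>Contact Us</title></head><body></body></html>",
}
print(duplicate_titles(pages))
```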
WCM SEO - Theme Relevant Content – Part 1
It’s worth mentioning that we are not talking about the SharePoint 2007 theme which allows you to control the look/feel of your site via styling. We are talking about the overall subject matter of your site. In terms of a SharePoint 2007 WCM site, theme relevant content refers to ensuring your text clearly supports the essence (or point) of the page it resides on. The page itself in turn needs to reinforce the point of the site. Your content should be targeting a human audience, not search engines. This is important because every time somebody clicks on your URL from a search engine results page, they have just cast a vote for your site. If they like what they see they will hopefully return to the search engine, re-enter their search criteria and click your URL again – casting yet another vote for your site.
A SharePoint 2007 WCM site gives you a great deal of flexibility in terms of Information Architecture (IA). To this end, it might be a good time to revisit your IA while keeping content relevancy in mind. A very easy way to view your existing IA in SharePoint 2007 is to click on “Site Actions” and then “View Content and Structure”. This will display an expandable view of your site collection which you can then easily use to reorganize your content. For example, if you want to move an entire section (or sub site) out of the mainstream of the site and into your “Archives” site as its own sub site: Simply place a check in the box next to the site you want to move, drop down the “Actions” tab and click “Move”. This will bring up a locations box which you use to identify where you wish to move the site; in this example it’s “Archives”. Place a check next to “Archives” and click “Move”. SharePoint 2007 will do the rest. This approach gives you the ability to move dated content out of the main flow of things, while simultaneously retaining its existence in an appropriate location. You can configure your SharePoint 2007 WCM site’s search engine to crawl or not crawl this archived content. If you want to crawl it yet keep it separated, simply configure a separate search scope called “Archive”. This will allow your site users to decide if they want search results pulling from your archived content. More on the cool things you can do with the SharePoint 2007 search engine later.
WCM SEO - Theme Relevant Content – Part 2
With SharePoint 2007 Web Content Management (WCM) also comes enterprise grade content classification functionality. Arranging content on a page is handled using page layouts. A page layout dictates what type of content goes on a page, and how it is arranged on the page. Page layouts help content authors keep the page focused and clean. The page presentation is the combination of the site’s master page (which dictates the structure of the site’s underlying framework, including navigation and other entities which are common across the site) and the page layout (which dictates the structure of content on a page).
Theme relevant content can be carefully managed using these WCM components to your advantage, but first you will need to fully understand how all this works. A SharePoint WCM site offers another huge advantage called a “site column”. Site columns can be thought of as fields in a database. You can create a site column called “Environmental Issue” and set it as a “choice” data type. Then provide the options for authors to select (example, Global Warming, Greenhouse Gases, Erosion, etc…). This also provides the mechanism of tagging your content. This can be a dropdown or a multi-select using checkboxes. Create as many site columns as needed.
Now that we have our site columns created, we need to exploit another powerful component of a SharePoint WCM site called “content types”. Content types are simply a collection of site columns. Content types can be used for many things in SharePoint 2007, including page layouts (see above). You can create a content type called “Environmental Article Content” and add the appropriate site columns. For example, the “Environmental Article Content” content type might contain the following site columns: Title, Author, Date, Environmental Issue, and Article Body. These are the basic ingredients you want all articles published on your WCM site to contain.
Now it’s time to bring it all together. Remember, a page layout dictates the type of content that goes on a page. This is accomplished by basing the page layout on… you guessed it… a “content type”. So keeping with this example, you would now create a new page layout called “Environmental Article Page”. You base the page layout on the content type called “Environmental Article Content”.
WCM authors can now spend more time publishing environmental articles to your site and less time bogging down IT with requests. Even better, these articles are all formatted and displayed in a consistent, organized, and theme relevant fashion.
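If it helps to picture how site columns, content types and page layouts hang together, the relationship can be modeled in a few lines of plain Python. To be clear, this is only an illustration of the concept and not the SharePoint object model; the class names, the `problems` method and the sample draft are all invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteColumn:
    name: str
    choices: tuple = ()  # empty tuple means a free-text column


@dataclass(frozen=True)
class ContentType:
    name: str
    columns: tuple  # a content type is simply a collection of site columns


@dataclass(frozen=True)
class PageLayout:
    name: str
    content_type: ContentType  # the layout is based on a content type

    def problems(self, page: dict) -> list:
        """List what an author still has to fix before the page fits the layout."""
        issues = []
        for column in self.content_type.columns:
            value = page.get(column.name)
            if not value:
                issues.append(f"missing: {column.name}")
            elif column.choices and value not in column.choices:
                issues.append(f"invalid choice for {column.name}: {value}")
        return issues


issue = SiteColumn("Environmental Issue", ("Global Warming", "Greenhouse Gases", "Erosion"))
article = ContentType(
    "Environmental Article Content",
    (SiteColumn("Title"), SiteColumn("Author"), SiteColumn("Date"), issue, SiteColumn("Article Body")),
)
layout = PageLayout("Environmental Article Page", article)

# A made-up draft with two fields missing and one value outside the allowed choices.
draft = {"Title": "Coastal Erosion", "Author": "J. Doe", "Environmental Issue": "Recycling"}
print(layout.problems(draft))
```

The point of the model is the chain: the layout knows its content type, the content type knows its columns, and so every article is held to the same set of fields.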
Quality in-bound links
About SharePoint 2007 Web Content Management Pages
You might notice the absence of the “description” tag and the “keywords” tag. We will cover options available to reclaim these tags and populate them accordingly.
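A quick way to confirm whether a rendered page carries these two tags is to parse its HTML and look for them. The following is a minimal sketch using Python’s standard `html.parser`; the function name and the sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = (attributes.get("name") or "").lower()
            if name:
                self.meta[name] = attributes.get("content") or ""


def missing_seo_meta(html: str, required=("description", "keywords")) -> list:
    """Return the required meta tag names that are absent or empty."""
    collector = MetaCollector()
    collector.feed(html)
    return [name for name in required if not collector.meta.get(name, "").strip()]


# Made-up markup: a page with a title but neither meta tag.
print(missing_seo_meta("<html><head><title>Home</title></head></html>"))
```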
Time (guilty until proven innocent)
Although SharePoint 2007 Web Content Management (WCM) is search engine neutral, if not search engine friendly, your content authors still need to be SEO aware. Many CMS solutions offer the ability for multiple authors to manage content; this should not be anything new to you. However, when considering search engine optimization (SEO) efforts, you should consider the advice freely given to webmasters by Google.
Many CMS solutions employ dynamic URLs. These are complex URL query strings (for example: page.aspx?id=5&section=aboutus). It’s not a matter of the robot/spider being unable to crawl such URLs but more a matter of overloading the search engine. Search engines prefer pages which appear to be hand coded and live on a non-dynamic URL (example: page.aspx). Time has proven such pages to be less likely to be computer generated and more likely to have a point and serve a real purpose. In most cases, SharePoint 2007 does not use what search engines consider dynamic URLs.
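To see what a crawler is faced with, you can split a URL into its path and its query string parameters. This short sketch uses Python’s standard `urllib.parse`; the helper names are my own, and “dynamic” here simply means “has at least one query string parameter”, which is a simplification.

```python
from urllib.parse import parse_qsl, urlsplit


def query_parameters(url: str) -> list:
    """Return the (name, value) pairs found in the URL's query string."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)


def is_dynamic(url: str) -> bool:
    """Treat any URL that carries query string parameters as dynamic."""
    return bool(query_parameters(url))


for url in ("page.aspx", "page.aspx?id=5&section=aboutus"):
    print(url, is_dynamic(url), query_parameters(url))
```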
The ability to easily link to your pages using search engine friendly navigation is built in to SharePoint 2007 WCM. This means not only will your users be able to easily navigate throughout your entire site, but the search engine robots will also be able to effectively crawl your entire site. You can accomplish this by naming your sites and pages appropriately. SharePoint 2007 can automatically keep your navigation current based on your site’s “Navigation” settings. Naming your sites and pages appropriately builds anchor relevance. Any link on your site should be descriptive of its destination. For example, if you wish to add a link to your “Contact Us” page or site, the link should read something like “…more information about how to reach us can be found in our Contact Us section” (with the “Contact Us” text as the actual hyperlink). Try to avoid pitfalls like “click here” as the hyperlink – “click here” is not relevant to anything and also not descriptive of its destination. Giving your pages and sites a descriptive and accurate name will likewise prevent non-relevant links from appearing in your navigation.
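Vague link text is something you can audit automatically. The sketch below collects every hyperlink on a page and flags the ones whose text says nothing about the destination. It is plain Python with the standard library only; the `GENERIC` phrase list is my own short, arbitrary starting point and you would want to extend it for your site.

```python
from html.parser import HTMLParser

# Assumed list of non-descriptive phrases; extend to taste.
GENERIC = {"click here", "here", "more", "read more", "link"}


class LinkCollector(HTMLParser):
    """Collects (href, visible text) for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # collapse whitespace
            self.links.append((self._href, text))
            self._href = None


def vague_links(html: str) -> list:
    """Return the links whose anchor text is a generic phrase."""
    collector = LinkCollector()
    collector.feed(html)
    return [(href, text) for href, text in collector.links if text.lower() in GENERIC]


sample = (
    '<p>To reach us, <a href="/contact">click here</a>. More information can be '
    'found in our <a href="/contact">Contact Us</a> section.</p>'
)
print(vague_links(sample))
```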
A SiteMap is a way to describe the pages of your site to a search engine. It also provides the mechanism for letting search engines know when your pages have been added, removed or otherwise modified. A SiteMap file is an XML formatted file containing an entry (or item) for each page of your site. Each item also contains the date/time the page was last modified. The exact formatting of the XML file varies depending on which search engine it is tailored for; however, very soon we should have a unified “SiteMaps protocol” to be used for the big 3 (Google, Yahoo!, and Microsoft). This means you will have a single SiteMap XML file to maintain and all 3 search engines will pull from it! More about SiteMaps can be learned from the official website: http://www.sitemaps.org.
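For a feel of what such a file contains, here is a minimal Python sketch that writes a SiteMap in the format published at sitemaps.org: a `urlset` root element with one `url` entry per page, each holding a `loc` and a `lastmod`. The function name and the example.com URLs and dates are placeholders, not anything from a real site.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages) -> str:
    """Build SiteMap XML from an iterable of (absolute URL, last-modified date)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = modified.isoformat()  # YYYY-MM-DD
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages for illustration.
example_pages = [
    ("http://www.example.com/Pages/default.aspx", date(2008, 1, 15)),
    ("http://www.example.com/Archives/Pages/old-news.aspx", date(2007, 6, 1)),
]
print(build_sitemap(example_pages))
```

Regenerating the file from the list of published pages, rather than editing it by hand, is what keeps the last-modified dates honest.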
So how does MOSS SEO deal with SiteMaps? You will still need to tell each of the big 3 about your SiteMap file by registering it and verifying you are who you say you are (this is a painless process taking about 5 minutes). This will only need to be done once. Prior to registering your SiteMap, you will need to actually have a SiteMap file. Here’s where MOSS SEO will save you big time. Instead of manually updating your SiteMap file every time something on your MOSS site changes, you should simply add on a custom control that does this for you. It should automatically update your SiteMap file when a public facing page is published to a major version.
Update: Tim Dobrinski and Chris Prime have released a Version 2 of the SharePoint 2007 SiteMap Generator. This version can be installed as a SharePoint 2007 Feature! The SiteMap Generator automates the process of keeping your Google/Yahoo/LIVE Search SiteMap XML file current. The bits and the source code can be found on Tim’s blog.
If you are using SharePoint 2007 as a public facing web site in “CMS” mode (in other words, basing your site on the “Publishing with workflow” template), then you are very well positioned to meet the “fresh content” requirements. Do not underestimate the importance of this one; fresh content is very likely the single most important thing you can do. So how can MOSS SEO help with the fresh content requirement? Take advantage of the MOSS workflow. It’s extremely powerful and can facilitate all critical steps involved with generating fresh content.
Use workflow to drive your fresh content requirements. Assign sections of your site to content authors who are subject matter experts for that area. Set up SharePoint 2007 Workflow on your pages to require approval prior to publishing into a major version. This will allow the appropriate people to review/approve content prior to going live. Furthermore, the workflow can be as simple or as complicated as it needs to be, based on your requirements. For example, you may require all content to be approved by three people from your Legal department prior to being published. Let’s also say that all three people need to give the green light before it’s considered approved. You can set up a single SharePoint 2007 Workflow to route a single page in parallel mode to all three approvers at the same time. When the final approver gives the thumbs up, the page can be published.
You can also use SharePoint 2007 Workflow to control content expiration policies. Here’s a MOSS SEO tip: Do not totally remove “expired content” from your site if it can possibly still benefit users. You should move it into an “archives” section so that it is at least still available and not wasted. After all, this content has been carefully crafted, reviewed, refined, approved and published – it deserves to stick around and it’s already paid for. Retention policies can be applied to drive expired content through its own special workflow process which ultimately lands the content in its new home in archives. Unless you have a very good reason for deliberately trashing content (which sounds a bit shady to me), you should consider keeping it around.
So you have a sweet SharePoint WCM site and now you’re ready to optimize it?
Here are the high level steps involved with ensuring your WCM site is ready for the internet search engines.
Another powerful use of SharePoint 2007 WCM content types is using them with document libraries. This combination gives you the ability to further organize your content into a theme relevant structure which is vital for quality SEO efforts. How? For starters, you can specify meta-data that will be tied to a specific document. Furthermore, you can re-use this across the entire SharePoint WCM site. If you need to add or remove a site column from your content type, your changes “can” be rippled down to all document libraries using the affected content type. I said “can” because changes do not have to be reflected across the board; this is an option available during the content type modification process.
Here’s the scenario: Your SharePoint WCM site has multiple document libraries with multiple contributors. You want to ensure that existing document libraries all have at least a handful of columns (meta-data) which will help organize information (or content) now and in the future. And let’s also say you want any new document libraries to contain these same columns. You will need to start by creating the required site columns. A site column is a type of data which is available to the entire site collection. Think of it as a database field. It can be a simple text field called “Name” or a dropdown list of pre-specified options, a number, multiple lines of text, etc… Site columns are created from the Site Settings page, under “Site Columns”. Note: when creating a site column, you will need to provide a “Group”. This is just a way of organizing your site columns so that when you return some day it will make sense. You can create a group if you don’t want to place it under “custom”. I would encourage you to create a descriptive group here to help keep track of the original purpose of your site columns – but you don’t have to; it affects nothing.
The next step is to create the content type. A content type is a collection of site columns. You create content types from the “Site Settings” page under “Content Types”. During this process, you will basically specify what kind of content type you are creating; in other words, whether it is for use with pages, documents, etc… Then you just add the site columns you want the new content type to contain.
Lastly, it’s time to create the document library to your liking and then “clone it” by creating a template from it. You could just as easily modify an existing document library. To create a new document library, go to “Site Actions” / “View all site content” / “Create”. Then click on “Document Library”. From here just complete the form. To modify an existing document library, navigate to the actual document library (not a page that has a view of it; you will need to actually go to where it lives). You are in the right place if you see “Actions” and “Settings” at the top of the document library list. Drop down “Settings” and select “Document Library Settings”. On the settings page of the document library, you will either see an existing content type section or you won’t. If you don’t see the content type section, the document library is not already associated with a content type. Now to add a content type to the document library: On the same settings page of the document library, click on “Advanced Settings”. From this screen, you will see a section for content types. This is how you associate a content type with a document library. Finally, when you have the document library the way you want it, return to the settings page for the document library and create a template. This template will be made available to users with permission to create document libraries. Give it a meaningful name and you are done. Note: the original out of the box document library template is still available, but can be removed if you don’t want users to even have that option.
These days, if your site does not have its own search engine, it’s almost insulting to your users. The OOTB SharePoint WCM Search Engine gives your site robust and highly flexible search functionality which will be noticed by your users. You can not only crawl your own site, but as many external sites as you wish. Search results are ranked and displayed based on relevancy. Permissions are also respected on search results pages via built in security trimming. This means if you have content which is not open to the anonymous “Joe Public” user, Joe Public will not even see this protected content in the search results, regardless of how relevant it is.
You cannot muck with how the search engine ranks results, but you can rest assured that at the end of the day, unless you happen to write search engine algorithms for a living, the SharePoint Search Engine algorithm will get the job done far better than anything you come up with. Don’t worry, there is a way for you to return specific results for specific search terms, called “Best Bets”. Best bets are very much like Google Sponsored Links. But rather than bidding for the top spot, you simply specify the relationship between the desired search term and the desired URL to return. Best bets will appear first in the list of results.
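The way security trimming and Best Bets shape a results page can be pictured with a toy example. What follows is emphatically not how SharePoint Search works internally: the ranking is a crude word count standing in for the real algorithm, and the function, field names and sample pages are all invented for this sketch.

```python
def search(term: str, index: list, best_bets: dict, user_groups: set) -> list:
    """Return result URLs: Best Bet first, then permitted pages by a crude relevance score."""
    term = term.lower()
    # Security trimming: drop anything the user is not allowed to see.
    visible = [doc for doc in index if doc["allowed"] & user_groups]
    # Stand-in ranking: how often the term appears in the page text.
    scored = [(doc["text"].lower().split().count(term), doc["url"]) for doc in visible]
    ranked = [url for score, url in sorted(scored, key=lambda pair: -pair[0]) if score > 0]
    # Best Bets: a fixed term-to-URL relationship that is listed first.
    promoted = [best_bets[term]] if term in best_bets else []
    return promoted + [url for url in ranked if url not in promoted]


# Made-up index and Best Bet for illustration.
index = [
    {"url": "/pricing", "text": "pricing plans and pricing faq", "allowed": {"anonymous"}},
    {"url": "/internal/pricing", "text": "pricing pricing pricing draft", "allowed": {"staff"}},
    {"url": "/about", "text": "about our team", "allowed": {"anonymous"}},
]
best_bets = {"pricing": "/buy"}

print(search("pricing", index, best_bets, {"anonymous"}))
print(search("pricing", index, best_bets, {"anonymous", "staff"}))
```

Notice that the anonymous visitor never learns the internal page exists, even though by the toy score it is the most relevant match.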
Wild card searches and Word Stemming are NOT the same thing. Word Stemming refers to returning inflectional variants of the root word. For example, “jumped” and “jumping” are both stemmed from the root word (verb) “jump”. Wild card searches, however, use a symbol (typically *) to represent ANY character or string of characters. For example, the wild card search term “jump*” might return “jumpdrive” or “jumpville”. Pretty big difference! It’s worth noting that most languages separate words using whitespace, but not all do (East Asian languages, for example). Word Stemming is clearly more natural language based and is supported in SharePoint 2007 Search. Wild card searches require a custom web part and careful consideration, as they could increase the size of your search index.
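The difference is easy to demonstrate with a few lines of Python. The stemmer here is a deliberately naive suffix stripper that I made up for the example; real stemmers, including whatever SharePoint uses, are far more sophisticated. The wild card match uses the standard `fnmatch` module.

```python
from fnmatch import fnmatchcase

SUFFIXES = ("ing", "ed", "s")  # a tiny, assumed set of inflectional endings


def naive_stem(word: str) -> str:
    """Strip one common ending, keeping at least three letters of the root."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_search(term: str, words: list) -> list:
    """Match words that share the term's root."""
    root = naive_stem(term)
    return [word for word in words if naive_stem(word) == root]


def wildcard_search(pattern: str, words: list) -> list:
    """Match words against a pattern where * stands for any run of characters."""
    return [word for word in words if fnmatchcase(word.lower(), pattern.lower())]


words = ["jump", "jumped", "jumping", "jumps", "jumpdrive", "jumpville"]
print("stemming:", stem_search("jump", words))
print("wild card:", wildcard_search("jump*", words))
```

Stemming stays within the forms of one word, while the wild card happily pulls in unrelated words that merely start with the same letters.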
Word Stemming is turned off by default for the English language in MOSS. You can enable word stemming on the search results page:
Word Stemming is a fascinating topic relating a word form to its base form and other related word forms (called morphological processing). Word Stemming explained: http://blogs.msdn.com/miketag/archive/2006/12/27/moss-search-word-stemming-part-1.aspx
Do not exclude SEO considerations when hammering out your SharePoint 2007 WCM site IA. Keeping people coming back to your WCM site is a product of a sound SharePoint SEO strategy. Careful consideration should be applied to overall look and feel of your site in order to address the needs of your users. To complicate matters, most sites target multiple audiences with very specific needs. Putting yourself in the computer chair of your many users can quickly become an inescapable mind trap unless approached methodically while always keeping business objectives in mind.
While there are many different methods of achieving a look and feel which is intuitive and useful, not many even consider search engine optimization. Which page is ultimately going to carry more weight if all other things are equal: (A) a page located on or near the root URL, or (B) a page deep linked five levels down from the root URL? Answer: (A). Why? If you put a page on the root of your site it must be pretty important to you, and therefore important to the search engine. An equally relevant page located on another site, but five levels deep, just lost this race.
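If you want to see how deep your own pages sit, you can count the folder levels in each URL and sort by that number. A minimal sketch with Python’s standard `urllib.parse` follows; the function name and the example.com URLs are placeholders, and “depth” here means only the number of folders in the path, which is my own simple yardstick.

```python
from urllib.parse import urlsplit


def url_depth(url: str) -> int:
    """Count the folder levels between the site root and the page."""
    parts = urlsplit(url).path.split("/")
    # Everything before the last "/" is a folder; the final part is the page itself.
    return len([part for part in parts[:-1] if part])


# Placeholder URLs for illustration.
urls = [
    "http://www.example.com/a/b/c/d/e/page.aspx",
    "http://www.example.com/default.aspx",
    "http://www.example.com/Products/widgets.aspx",
]
for url in sorted(urls, key=url_depth):
    print(url_depth(url), url)
```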
A solid IA directly impacts your SharePoint SEO efforts through how users respond to your site once they arrive. Do users immediately hit the back button and do another search (or worse, click on another site)? This action tells the search engine exactly what the user thought of your site. You want users to get to what they want immediately or they are gone. Preferably users land on the precise page they are looking for upon entering your site. Remember, not all users will enter your SharePoint WCM site from its home page.
When talking about the overall look and feel of your SharePoint WCM site, you are talking about two main things: Information architecture and taxonomy. Information architecture is focused mainly on Web content as building blocks to be fit into a site's visual design and navigation scheme. In SharePoint, you are talking about sites, sub-sites and the pages contained therein. How these things will be arranged and found in order to meet the needs of your users equals IA.
Taxonomies are often created to describe categories and subcategories of topics found on your SharePoint Web site. Your SharePoint WCM site really pulls ahead of other CMS solutions in the taxonomy arena. SharePoint 2007 uses “site columns” to categorize content. You have full control over site columns and the rewards of finally being able to “tag” your content spill over into your SEO efforts.
Where is your traffic coming from? What time of day? How are they finding your site and just as important; how are they NOT finding your site when they would benefit from its content?
Tracking what your WCM site users are actually doing should be very near the top of your priorities, perhaps falling directly below keeping your site up and accessible. The very common habit of simply assuming your site structure is intuitive and easily navigable because it makes sense to you is tragically flawed. Knowing what time of year users are looking for certain content can quickly be used for increasing conversion rates. Identifying the places in your WCM site that most users exit from also has obvious benefits.
SharePoint 2007 offers many ways of keeping track of this important data. MOSS out of the box usage reports provide a solid platform for basic web analytics. You will first need to enable this functionality from Central Admin, SSP and the site collection. Once enabled, you can view the site’s usage reports from Site Actions / Site Settings / All Site Settings. Site Collection Administrators can further build upon usage analysis reports by creating “audits”.
Audits allow you to build reports on things like “who has downloaded files from a document library”. These audits can be set up to monitor practically all aspects of a file. Audit data is compiled into an XML based file format which opens and displays quite nicely in the latest version of MS Excel.
Challenges Facing New Domains
The views expressed on this site are solely the views of Sean Bordner.
Microsoft™ and SharePoint™ are certified trademarks of Microsoft Corp.