Microsoft Office SharePoint Portal Server 2003

About Content Sources

A content source is a starting point that Microsoft Office SharePoint Portal Server 2003 uses to create an index of information stored in a particular location. This content can be located on the same server, on another server on your intranet, or on the Internet. After SharePoint Portal Server includes the content in the index, users can search for it and view it on the portal site. Examples of content sources include Web sites, file systems, other SharePoint Portal Server computers, Windows SharePoint Services sites, and Lotus Notes databases.

The Site Directory is the easiest way to add content to the portal site for searching. When a user adds a site, they have the option to include its contents in search results. A search administrator can have sites automatically approved for searching or can manage approval for each site. After approval, a site is included in the index and its contents appear in search results. However, content sources offer more control over what is searched.
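
The approval flow described above can be modeled in a few lines. This is only an illustrative sketch: the class, field, and function names are hypothetical and do not correspond to any SharePoint Portal Server object or setting.

```python
from dataclasses import dataclass


@dataclass
class SiteListing:
    """Hypothetical Site Directory entry; field names are illustrative only."""
    url: str
    include_in_search: bool  # option the user chooses when adding the site
    approved: bool = False   # set automatically or by the search administrator


def add_site(listing: SiteListing, auto_approve: bool) -> SiteListing:
    """Record a new site and approve it at once only when the user opted in
    to search and the administrator has enabled automatic approval."""
    if listing.include_in_search and auto_approve:
        listing.approved = True
    return listing


def is_indexed(listing: SiteListing) -> bool:
    """In this model, a site is included in the index only after approval."""
    return listing.include_in_search and listing.approved


# Automatic approval: the site is indexed as soon as it is added.
auto_site = add_site(SiteListing("http://teams.example/sales", True), auto_approve=True)

# Manual approval: the site waits until the administrator approves it.
manual_site = add_site(SiteListing("http://teams.example/hr", True), auto_approve=False)
indexed_before_approval = is_indexed(manual_site)
manual_site.approved = True
indexed_after_approval = is_indexed(manual_site)
```

In the manual case the site stays out of search results until the administrator acts, which is the per-site approval option mentioned above.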

SharePoint Portal Server can crawl the following types of content sources:

Some content source types, such as Exchange Public Folders and Lotus Notes databases, cannot be included in the Site Directory. You can define content sources for content that is not included in the Site Directory or that requires a special update schedule.

A special type of content source is the "This Portal" content source. Because it is a system content source, you cannot delete it. It controls indexing of all internal portal site content.

Security Consideration

Most content source types that ship with Microsoft Office SharePoint Portal Server 2003 have custom protocol handlers that enable SharePoint Portal Server to determine which users have rights to access documents.

The exception is the content source for Web pages or Web sites. When crawling a page or site that uses the HTTP or HTTPS protocol, SharePoint Portal Server cannot determine which users can access documents. If HTTPS or other restricted HTTP content is successfully crawled (that is, if the crawling account has access to the content), the content, including a document summary, is returned in search results. Users may therefore see results for documents that they do not have rights to access. When these users click a result for a document that they cannot access, they are prompted to enter credentials.

This exception does not apply to the following content that uses the HTTP or HTTPS protocol:

These crawls, although using the HTTP or HTTPS protocols, are able to determine which users can access documents so that content is not exposed to unauthorized users in search results.
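
The following sketch illustrates why the distinction matters. It is a simplified model, not the product's implementation: the `IndexedDocument` type, the `allowed_users` field, and the sample URLs are all assumptions made for illustration. An entry whose access information is unknown cannot be filtered, so every user sees it.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class IndexedDocument:
    """Hypothetical index entry; not a SharePoint Portal Server type."""
    url: str
    summary: str
    # None models a crawl that could not record who may read the document,
    # as with a plain Web page or Web site content source.
    allowed_users: Optional[FrozenSet[str]]


def visible_results(index: List[IndexedDocument], user: str) -> List[IndexedDocument]:
    """Return the entries that a user sees in search results in this model."""
    results = []
    for doc in index:
        if doc.allowed_users is None:
            # No access information was captured, so the entry cannot be
            # filtered and is shown to every user, summary included.
            results.append(doc)
        elif user in doc.allowed_users:
            # Access information is known and the user has rights.
            results.append(doc)
    return results


index = [
    IndexedDocument("file://server/share/plan.doc", "Project plan", frozenset({"alice"})),
    IndexedDocument("https://intranet.example/budget.htm", "Budget summary", None),
]

alice_urls = [doc.url for doc in visible_results(index, "alice")]
bob_urls = [doc.url for doc in visible_results(index, "bob")]
```

In this model, the user `bob` has no rights to either document but still sees the summary of the Web page, while the file share document is hidden from him.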

Related Topics

Adding a Content Source
Editing a Content Source
Deleting a Content Source
Starting a Full Update of a Content Source
Starting an Incremental Update of a Content Source
Stopping an Update of a Content Source
Viewing the Gatherer Log for a Content Source
Configuring the Lotus Notes Protocol Handler
About the Site Directory
©2003 Microsoft Corporation. All rights reserved.