When Microsoft Office SharePoint Portal Server 2003 creates content indexes, it uses access accounts to crawl Web sites, servers, and network resources included in each content index. The permissions of the access account determine the success of the crawl.
When crawling sites based on Windows SharePoint Services, crawling works best when the access account has administrative access to the Windows SharePoint Services server farm. The access account should be a member of the SharePoint Administration group.
If the access account is not a member of the SharePoint Administration group, the server will crawl the site, but the content index will not be secure because all users will be able to access the documents. In addition, some data might not be included in the content index because the account might not be able to access all of the sites under a Windows SharePoint Services site.
When crawling sites based on SharePoint Team Services v1.0, the access account must be assigned to the Administrator role.
The default content access account must have Read permissions for the Web sites and servers being crawled. In addition, the account must have the permissions discussed previously to crawl sites based on Windows SharePoint Services and SharePoint Team Services.
When SharePoint Portal Server creates a content index for sites on the Internet, it first provides the default content access account. If this account has not been configured, SharePoint Portal Server provides the anonymous account. If you are using a proxy server, the account that you use to access the Internet must have permissions on the proxy server before SharePoint Portal Server can create a content index of sites on the Internet. Without these permissions, you can crawl only content on your intranet.
Typically, you specify the default content access account when you install SharePoint Portal Server. This account applies to all content sources. If you want to configure an account for a specific content source, you can define a site or path rule for that content source and associate an account with it. If no site access account is specified for a content source, SharePoint Portal Server uses the default content access account when accessing content in that content source.
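The account selection described above can be sketched as a small lookup. This is only an illustration of the fallback order, not SharePoint Portal Server code: the names PathRule and resolve_access_account are hypothetical, and the choice to prefer the longest matching path prefix is an assumption that the text does not state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PathRule:
    """Hypothetical stand-in for a site or path rule on a content source."""
    prefix: str                    # URL prefix that the rule applies to
    account: Optional[str] = None  # account associated with the rule, if any


def resolve_access_account(url: str,
                           rules: List[PathRule],
                           default_account: Optional[str]) -> Optional[str]:
    """Return the account to use for crawling url.

    Order: an account attached to a matching rule, then the default
    content access account. None means no account is configured, in
    which case the text says the anonymous account is provided for
    Internet sites.
    """
    matching = [
        rule for rule in rules
        if rule.account and url.lower().startswith(rule.prefix.lower())
    ]
    if matching:
        # Assumption: the most specific (longest) prefix wins.
        return max(matching, key=lambda rule: len(rule.prefix)).account
    return default_account


rules = [
    PathRule("http://teams.example.com/", "EXAMPLE\\teamcrawler"),
    PathRule("http://teams.example.com/hr/", "EXAMPLE\\hrcrawler"),
]
print(resolve_access_account("http://teams.example.com/hr/policies.doc",
                             rules, "EXAMPLE\\defaultcrawler"))
print(resolve_access_account("http://intranet.example.com/news.htm",
                             rules, "EXAMPLE\\defaultcrawler"))
```

The first call matches both rules and uses the account of the more specific one; the second matches no rule and falls back to the default content access account.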
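SharePoint Portal Server typically tries NTLM authentication first and uses Basic authentication only if NTLM is not available. Credentials passed when using Basic authentication are not secure, because they are Base64-encoded rather than encrypted and anyone who can read the network traffic can recover them.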
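To show why Basic authentication credentials are not secure, the following sketch builds a standard HTTP Basic Authorization header value and then recovers the user name and password from it with no key or secret. The function names and the sample account are hypothetical and are used only for illustration; the header format itself is the standard one for HTTP Basic authentication.

```python
import base64
from typing import Tuple


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth_header(header: str) -> Tuple[str, str]:
    """Recover the user name and password from a Basic Authorization header value."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # The user name cannot contain a colon, so split on the first one only.
    username, _, password = decoded.partition(":")
    return username, password


# Hypothetical crawl account, for illustration only.
header = basic_auth_header("EXAMPLE\\crawler", "example-password")
print(header)
print(decode_basic_auth_header(header))
```

Because decoding requires nothing beyond the header itself, Basic authentication protects credentials only when the connection is encrypted by other means, such as SSL.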
SharePoint Portal Server uses the configuration database administration account when connecting to the configuration database and when propagating indexes from index management servers to search servers. This account must be a member of the local Administrators group on the search server.
The index management server crawls content to include it in a content index. The search server executes queries. You propagate a content index from the index management server to another server to free resources for other processes on the destination server. The destination server can be a dedicated search server, or it can be a server that is also running other components.
You can dedicate one server to creating and updating content indexes and another server to processing queries. You create the content index on the first server (the index management server) and propagate the content index to the second server (the search server). This arrangement confines the resource-intensive crawling and indexing processes to the server dedicated to indexing, so they do not affect the performance of the server dedicated to searching.