Estimated time to crawl content

Crawl testing covered the three types of content sources, with and without latency and bandwidth constraints. Because setting up, running, and reinitializing each test scenario takes considerable time, the constrained tests were performed for only one bandwidth and latency combination per content source: the most likely worst-case scenario of 100 ms of latency and 10 Mbps of bandwidth.

SharePoint site content crawl

The following table reports the estimated time to crawl content in SharePoint sites based on available bandwidth, latency, and the volume of content.

| Bandwidth | Latency | Crawl rate | 1 GB | 5 GB | 25 GB | 100 GB | 500 GB |
|---|---|---|---|---|---|---|---|
| 10 Mbps | None | 467 MB/min | 2 min | 11 min | 54 min | 5 hr 30 min | 17 hr 45 min |
| 10 Mbps | 100 ms | 330 MB/min | 3 min | 15 min | 1 hr 15 min | 5 hr | 25 hr 15 min |
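
The estimates in the SharePoint table can be approximated by dividing the content volume by the crawl rate. The following is a minimal sketch of that arithmetic, not the method used in the original tests. It assumes that 1 GB is counted as 1,000 MB (the article does not say), and the function names `estimate_crawl_minutes` and `format_duration` are hypothetical helpers introduced only for illustration.

```python
# Sketch: estimate crawl time from content volume and a sustained crawl rate.
# Assumption (not stated in the article): 1 GB is treated as 1,000 MB.

MB_PER_GB = 1000  # assumed decimal gigabytes


def estimate_crawl_minutes(volume_gb: float, crawl_rate_mb_per_min: float) -> float:
    """Return the estimated crawl duration in minutes."""
    if crawl_rate_mb_per_min <= 0:
        raise ValueError("crawl rate must be positive")
    return volume_gb * MB_PER_GB / crawl_rate_mb_per_min


def format_duration(minutes: float) -> str:
    """Format minutes as 'H hr M min', rounding to the nearest minute."""
    total = round(minutes)
    hours, mins = divmod(total, 60)
    if hours == 0:
        return f"{mins} min"
    return f"{hours} hr {mins} min"


if __name__ == "__main__":
    # 330 MB/min is the crawl rate reported for 10 Mbps with 100 ms latency.
    for volume_gb in (1, 5, 25, 100, 500):
        minutes = estimate_crawl_minutes(volume_gb, 330)
        print(f"{volume_gb} GB: {format_duration(minutes)}")
```

Under these assumptions the results land close to the 100 ms row of the SharePoint table; small differences are expected because the published figures appear to be rounded.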

 

File-share crawl

The following table reports the estimated time to crawl content in file shares based on available bandwidth, latency, and the volume of content.

| Bandwidth | Latency | Crawl rate | 1 GB | 5 GB | 25 GB | 100 GB | 500 GB |
|---|---|---|---|---|---|---|---|
| 10 Mbps | None | 467 MB/min | 12 sec | 1 min | 5 min | 20 min | 1 hr 30 min |
| 10 Mbps | 100 ms | 330 MB/min | 2 min | 9 min | 45 min | 3 hr | 15 hr |

 

HTTP crawl

The following table reports the estimated time to crawl content across HTTP sites based on available bandwidth, latency, and the volume of content.

| Bandwidth | Latency | Crawl rate | 1 GB | 5 GB | 25 GB | 100 GB | 500 GB |
|---|---|---|---|---|---|---|---|
| 10 Mbps | None | 467 MB/min | 79 min | 6 hr 30 min | 33 hr | 132 hr | 659 hr |
| 10 Mbps | 100 ms | 330 MB/min | 3 hr | 15 hr 15 min | 76 hr 30 min | 305 hr 40 min | 1529 hr |

 

The information above comes from the following TechNet article:

https://technet.microsoft.com/en-us/library/cc262952.aspx#section4

Comments

  • Anonymous
    February 11, 2015
    And do you know the size of temporary folders to crawl 500GB fileshare?