Description: This test uses this page as the start_url, but the active value is set to 0 (not active).
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | inactive |
start_url | http://www.swayward.com/index.html |
active | 0 |
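
As a rough illustration of what this row implies, the sketch below inserts it into an in-memory SQLite table and then selects only the active rows. The table name `crawl_jobs`, the use of SQLite, and the `active = 1` selection rule are assumptions made for illustration; the real schema and database may differ.

```python
import sqlite3

# Hypothetical schema: only the three columns listed above are modeled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crawl_jobs (description TEXT, start_url TEXT, active INTEGER)"
)
conn.execute(
    "INSERT INTO crawl_jobs (description, start_url, active) VALUES (?, ?, ?)",
    ("inactive", "http://www.swayward.com/index.html", 0),
)

# Assumed selection rule: the crawler only picks up rows where active = 1.
active_rows = conn.execute(
    "SELECT start_url FROM crawl_jobs WHERE active = 1"
).fetchall()
print(active_rows)
```

If the crawler behaves as assumed, the inactive row is never selected and nothing is crawled for it.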
Description: This will test various types of IPv4 and IPv6 addresses, valid and invalid. NOTE: IPv6 is not active by default.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | test_two |
start_url | http://www.swayward.com/test_two/index.html |
active | 1 |
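
To make "valid and invalid" concrete, here is a minimal sketch that classifies strings with Python's standard `ipaddress` module. The helper name `classify_ip` and the sample strings are hypothetical; they are not taken from the test page, and the crawler under test may validate addresses differently.

```python
import ipaddress


def classify_ip(text):
    """Return 'IPv4', 'IPv6', or None if text is not a valid IP address."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return None
    return "IPv4" if addr.version == 4 else "IPv6"


# Sample strings chosen for illustration only.
samples = ["192.168.1.10", "256.1.1.1", "2001:db8::1", "2001:db8::zz"]
for sample in samples:
    print(sample, classify_ip(sample))
```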
Description: This will test the "default link patterns". If Ignore Patterns is TRUE, then links not matching the pattern WILL be followed anyway.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | Test three |
start_url | http://www.swayward.com/test_three/index.html |
active | 1 |
use_default_link_ignore_patterns | 1 |
link_must_contain | www.swayward.com/test_three |
max_pages_fetch | 5 |
Description: This will test the "default link patterns". If Ignore Patterns is FALSE, then links not matching the pattern WILL NOT be followed.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | Test Four |
start_url | http://www.swayward.com/test_four/index.html |
active | 1 |
use_default_link_ignore_patterns | 0 |
link_must_contain | www.swayward.com/test_four |
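
The sketch below encodes one literal reading of the Test three and Test Four descriptions: when the ignore flag is true, links are followed regardless of `link_must_contain`; when it is false, only links containing that substring are followed. The function `should_follow` and the substring comparison are assumptions, not the crawler's confirmed logic.

```python
def should_follow(url, link_must_contain, ignore_patterns):
    """Decide whether to follow a link, per the literal wording of the two tests.

    Assumption: when ignore_patterns is true, the link_must_contain check
    is skipped entirely.
    """
    if ignore_patterns:
        return True
    return link_must_contain in url


# Test three style: flag on, a non-matching link is still followed.
print(should_follow("http://www.example.com/a.html", "www.swayward.com/test_three", True))
# Test Four style: flag off, a non-matching link is rejected.
print(should_follow("http://www.example.com/a.html", "www.swayward.com/test_four", False))
```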
Description: This will test that IPv4 addresses are ignored.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | test.five |
start_url | http://www.swayward.com/test_five/index.html |
active | 1 |
ipv4_enabled | 0 |
Description: This will test that IPv6 addresses are ignored.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | test 6 |
start_url | http://www.swayward.com/test_six/index.html |
active | 1 |
ipv6_enabled | 0 |
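
The effect of the `ipv4_enabled` and `ipv6_enabled` flags can be sketched as a filter over candidate strings. The function `filter_ips` is hypothetical, the IPv4 default of enabled is an assumption, and the IPv6 default of disabled follows the note in the test two description.

```python
import ipaddress


def filter_ips(candidates, ipv4_enabled=True, ipv6_enabled=False):
    """Keep only valid addresses whose IP version is enabled."""
    kept = []
    for text in candidates:
        try:
            addr = ipaddress.ip_address(text)
        except ValueError:
            continue  # not a valid address of either version
        if addr.version == 4 and ipv4_enabled:
            kept.append(text)
        elif addr.version == 6 and ipv6_enabled:
            kept.append(text)
    return kept


# Sample addresses chosen for illustration only.
found = ["10.0.0.1", "2001:db8::1", "not-an-ip"]
print(filter_ips(found, ipv4_enabled=False, ipv6_enabled=True))
print(filter_ips(found, ipv4_enabled=True, ipv6_enabled=False))
```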
Description: This will test the maximum number of pages (links) to be fetched.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | 7 test |
start_url | http://www.swayward.com/test_seven/index.html |
active | 1 |
max_pages_fetched | 2 |
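
A page limit of this kind can be sketched as a breadth-first crawl that stops once the limit is reached. The in-memory link graph, the page names other than the start URL, and the `crawl` function are all hypothetical; no network access is involved, and the real crawler may count pages differently.

```python
from collections import deque


def crawl(start_url, links, max_pages_fetched):
    """Visit pages breadth-first, stopping after max_pages_fetched pages."""
    fetched = []
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(fetched) < max_pages_fetched:
        url = queue.popleft()
        fetched.append(url)  # stands in for an actual HTTP fetch
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return fetched


# Hypothetical site structure: the index links to two pages, one links further.
start = "http://www.swayward.com/test_seven/index.html"
site = {
    start: ["page_a.html", "page_b.html"],
    "page_a.html": ["page_c.html"],
}
print(crawl(start, site, max_pages_fetched=2))
```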
Description: This will test obeying the robots.txt file (obey_robots_txt is TRUE). No IP addresses should be recorded.
See robotstxt.org. NOTE: obey_robots_txt is TRUE by default.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | 8A obey robots |
start_url | http://www.swayward.com/test_eight_a/index.html |
active | 1 |
obey_robots_txt | 1 |
Description: This will test disobeying the robots.txt file (obey_robots_txt is FALSE). Even though the user agent is blocked by robots.txt, IP addresses will be recorded.
See robotstxt.org. NOTE: obey_robots_txt is TRUE by default.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | 8B disobey robots |
start_url | http://www.swayward.com/test_eight_b/index.html |
active | 1 |
obey_robots_txt | 0 |
Description: This will test the User Agent String sent. The robots.txt file will allow the default user agent of crawler4j but will disallow stupidHeadBot. No IP addresses should be recorded.
See robotstxt.org. NOTE: obey_robots_txt is TRUE by default.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | 8c agent string |
start_url | http://www.swayward.com/test_eight_c/index.html |
active | 1 |
user_agent_string | stupidheadbot |
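
The robots.txt behavior exercised by tests 8A, 8B, and 8c can be approximated with Python's standard `urllib.robotparser`. The robots.txt content below is a guess at what the test site might serve, and `may_fetch` is a hypothetical helper; crawler4j's own matching rules may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file on the test site may differ.
ROBOTS_TXT = """\
User-agent: stupidHeadBot
Disallow: /

User-agent: crawler4j
Allow: /
"""


def may_fetch(url, user_agent, obey_robots_txt=True):
    """Return True if the crawler is allowed to fetch url."""
    if not obey_robots_txt:
        return True  # robots.txt is not consulted at all
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


url = "http://www.swayward.com/test_eight_c/index.html"
print(may_fetch(url, "stupidheadbot"))
print(may_fetch(url, "crawler4j"))
print(may_fetch(url, "stupidheadbot", obey_robots_txt=False))
```

Note that `urllib.robotparser` compares user agent names case-insensitively, so `stupidheadbot` matches the `stupidHeadBot` entry in this sketch.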
Description: This will test the max page size. If the page is larger than the specified size, no IPs should be captured.
Set the following columns with these values. Leave any column not listed blank or NULL.
Column Name | Column Value |
---|---|
description | Max page size |
start_url | http://www.swayward.com/test_nine/index.html |
active | 1 |
max_page_size_bytes | 500 |
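
A size check like this can be sketched as a guard placed in front of IP extraction. The function `extract_ips`, the IPv4-only regular expression, and the sample page bodies are assumptions for illustration; the real crawler may measure page size or extract addresses differently.

```python
import ipaddress
import re

# Loose pattern for dotted-quad candidates; real validation happens below.
CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def extract_ips(page_bytes, max_page_size_bytes):
    """Return valid IPv4 strings from the page, or [] if the page is too large."""
    if len(page_bytes) > max_page_size_bytes:
        return []  # oversized page: skip extraction entirely
    text = page_bytes.decode("utf-8", errors="replace")
    found = []
    for match in CANDIDATE.findall(text):
        try:
            ipaddress.ip_address(match)
        except ValueError:
            continue  # e.g. an octet above 255
        found.append(match)
    return found


# Hypothetical page bodies: one under the 500-byte limit, one over it.
small_page = b"<p>Server at 10.0.0.1</p>"
large_page = small_page + b" " * 600
print(extract_ips(small_page, 500))
print(extract_ips(large_page, 500))
```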