public class SourceOptionsWebCrawl
extends com.ibm.cloud.sdk.core.service.model.GenericModel
Modifier and Type | Class and Description
---|---
static class | SourceOptionsWebCrawl.Builder - Builder.
static interface | SourceOptionsWebCrawl.CrawlSpeed - The number of concurrent URLs to fetch.
Modifier and Type | Field and Description
---|---
protected java.lang.Boolean | allowUntrustedCertificate
protected java.util.List<java.lang.String> | blacklist
protected java.lang.String | crawlSpeed
protected java.lang.Boolean | limitToStartingHosts
protected java.lang.Long | maximumHops
protected java.lang.Boolean | overrideRobotsTxt
protected java.lang.Long | requestTimeout
protected java.lang.String | url
Modifier | Constructor and Description
---|---
protected | SourceOptionsWebCrawl(SourceOptionsWebCrawl.Builder builder)
Modifier and Type | Method and Description
---|---
java.lang.Boolean | allowUntrustedCertificate() - Gets the allowUntrustedCertificate.
java.util.List<java.lang.String> | blacklist() - Gets the blacklist.
java.lang.String | crawlSpeed() - Gets the crawlSpeed.
java.lang.Boolean | limitToStartingHosts() - Gets the limitToStartingHosts.
java.lang.Long | maximumHops() - Gets the maximumHops.
SourceOptionsWebCrawl.Builder | newBuilder() - New builder.
java.lang.Boolean | overrideRobotsTxt() - Gets the overrideRobotsTxt.
java.lang.Long | requestTimeout() - Gets the requestTimeout.
java.lang.String | url() - Gets the url.
protected java.lang.String url
@SerializedName(value="limit_to_starting_hosts") protected java.lang.Boolean limitToStartingHosts
@SerializedName(value="crawl_speed") protected java.lang.String crawlSpeed
@SerializedName(value="allow_untrusted_certificate") protected java.lang.Boolean allowUntrustedCertificate
@SerializedName(value="maximum_hops") protected java.lang.Long maximumHops
@SerializedName(value="request_timeout") protected java.lang.Long requestTimeout
@SerializedName(value="override_robots_txt") protected java.lang.Boolean overrideRobotsTxt
protected java.util.List<java.lang.String> blacklist
protected SourceOptionsWebCrawl(SourceOptionsWebCrawl.Builder builder)
public SourceOptionsWebCrawl.Builder newBuilder()
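The class is constructed through its fluent builder. The sketch below is a self-contained, simplified stand-in that illustrates the pattern; the stand-in class and its setter names are illustrative only, not the SDK's actual code.

```java
// Simplified stand-in for SourceOptionsWebCrawl and its Builder
// (illustrative only; the real class lives in the IBM Watson SDK).
class WebCrawlOptions {
    final String url;
    final String crawlSpeed;
    final Long maximumHops;

    private WebCrawlOptions(Builder b) {
        this.url = b.url;
        this.crawlSpeed = b.crawlSpeed;
        this.maximumHops = b.maximumHops;
    }

    static class Builder {
        private String url;
        private String crawlSpeed;
        private Long maximumHops;

        Builder url(String url) { this.url = url; return this; }
        Builder crawlSpeed(String s) { this.crawlSpeed = s; return this; }
        Builder maximumHops(long h) { this.maximumHops = h; return this; }
        WebCrawlOptions build() { return new WebCrawlOptions(this); }
    }
}

public class BuilderDemo {
    public static void main(String[] args) {
        // Each setter returns the builder, so calls chain until build().
        WebCrawlOptions opts = new WebCrawlOptions.Builder()
            .url("https://example.com")
            .crawlSpeed("normal")
            .maximumHops(2)
            .build();
        System.out.println(opts.url + " " + opts.crawlSpeed + " " + opts.maximumHops);
    }
}
```

Once built, the object is immutable; the getter methods documented above return the values set on the builder.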
public java.lang.String url()
The starting URL to crawl.
public java.lang.Boolean limitToStartingHosts()
When `true`, crawls of the specified URL are limited to the host part of the **url** field.
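The host restriction can be illustrated with a short, self-contained check (the helper name is illustrative, not part of the SDK): a link qualifies only when its host matches the host of the starting URL.

```java
import java.net.URI;

public class HostLimitDemo {
    // Illustrates the limit_to_starting_hosts rule: a link is followed
    // only when its host equals the host of the starting URL.
    static boolean sameHost(String startUrl, String candidate) {
        return URI.create(startUrl).getHost()
                  .equals(URI.create(candidate).getHost());
    }

    public static void main(String[] args) {
        System.out.println(sameHost("https://ibm.com/start", "https://ibm.com/page"));
        System.out.println(sameHost("https://ibm.com/start", "https://example.org/x"));
    }
}
```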
public java.lang.String crawlSpeed()
The number of concurrent URLs to fetch. `gentle` means one URL is fetched at a time, with a delay between each call. `normal` means as many as two URLs are fetched concurrently, with a short delay between fetch calls. `aggressive` means that up to ten URLs are fetched concurrently, with a short delay between fetch calls.
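The three speed values map to concurrency limits as described above; a minimal lookup table makes the mapping concrete (the map itself is illustrative, not SDK code):

```java
import java.util.Map;

public class CrawlSpeedDemo {
    // Concurrency implied by each crawl_speed value, per the description:
    // gentle = 1 URL at a time, normal = up to 2, aggressive = up to 10.
    static final Map<String, Integer> MAX_CONCURRENT =
        Map.of("gentle", 1, "normal", 2, "aggressive", 10);

    public static void main(String[] args) {
        System.out.println(MAX_CONCURRENT.get("normal"));
    }
}
```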
public java.lang.Boolean allowUntrustedCertificate()
When `true`, allows the crawl to interact with HTTPS sites whose SSL certificates have untrusted signers.
public java.lang.Long maximumHops()
The maximum number of hops to make from the initial URL. When a page is crawled, each link on that page is also crawled if it is within **maximum_hops** of the initial URL. The first page crawled is 0 hops, each link crawled from the first page is 1 hop, each link crawled from those pages is 2 hops, and so on.
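The hop-counting rule can be sketched as a breadth-first traversal over a tiny in-memory link graph; this is a self-contained illustration of the semantics, not the crawler's actual implementation.

```java
import java.util.*;

public class HopLimitDemo {
    // Hop-limited traversal mirroring the maximum_hops rule:
    // the start page is hop 0, its links are hop 1, and so on.
    static Set<String> crawl(Map<String, List<String>> links,
                             String start, int maxHops) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        Deque<String> frontier = new ArrayDeque<>(List.of(start));
        for (int hop = 0; hop < maxHops; hop++) {
            Deque<String> next = new ArrayDeque<>();
            for (String page : frontier) {
                for (String link : links.getOrDefault(page, List.of())) {
                    if (visited.add(link)) next.add(link);
                }
            }
            frontier = next;
        }
        return visited;
    }

    public static void main(String[] args) {
        // A links to B, B to C, C to D: with maximum_hops = 2,
        // D (3 hops away) is never fetched.
        Map<String, List<String>> links = Map.of(
            "A", List.of("B"),
            "B", List.of("C"),
            "C", List.of("D"));
        System.out.println(crawl(links, "A", 2));
    }
}
```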
public java.lang.Long requestTimeout()
The maximum number of milliseconds to wait for a response from the web server.
public java.lang.Boolean overrideRobotsTxt()
When `true`, the crawler ignores any `robots.txt` file it encounters. This should only ever be done when crawling a web site the user owns. This must be set to `true` when a **gateway_id** is specified in the **credentials**.
public java.util.List<java.lang.String> blacklist()
Array of URLs to exclude while crawling. The crawler will not follow links that contain any of these strings. For example, listing `https://ibm.com/watson` also excludes `https://ibm.com/watson/discovery`.
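The exclusion is a simple substring match, which is why a blacklist entry also covers every URL beneath it. A self-contained sketch of that check (the helper name is illustrative, not part of the SDK):

```java
import java.util.List;

public class BlacklistDemo {
    // Substring-based exclusion: a URL is skipped when it contains
    // any entry from the blacklist.
    static boolean excluded(String url, List<String> blacklist) {
        return blacklist.stream().anyMatch(url::contains);
    }

    public static void main(String[] args) {
        List<String> blacklist = List.of("https://ibm.com/watson");
        // The sub-path is excluded because it contains the listed string.
        System.out.println(excluded("https://ibm.com/watson/discovery", blacklist));
        System.out.println(excluded("https://ibm.com/cloud", blacklist));
    }
}
```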