The main configuration file, harvest.yml, consists of the following sections, described below:
Pollers¶
All pollers are defined in harvest.yml, the main configuration file of Harvest, under the section Pollers.
| parameter | type | description | default | 
|---|---|---|---|
| Poller name (header) | required | Poller name, user-defined value | |
| datacenter | required | Datacenter name, user-defined value | |
| addr | required by some collectors | IPv4, IPv6 or FQDN of the target system | |
| collectors | required | List of collectors to run for this poller | |
| exporters | required | List of exporter names from the Exporters section. Note: this should be the name of the exporter (e.g. prometheus1), not the value of the exporter key (e.g. Prometheus) | |
| auth_style | required by Zapi* collectors | Either basic_auth or certificate_auth. See authentication for details | basic_auth |
| username, password | required if auth_style is basic_auth | | |
| ssl_cert, ssl_key | optional if auth_style is certificate_auth | Paths to SSL (client) certificate and key used to authenticate with the target system. If not provided, the poller will look for <hostname>.key and <hostname>.pem in $HARVEST_HOME/cert/. To create certificates for ONTAP systems, see using certificate authentication | |
| ca_cert | optional if auth_style is certificate_auth | Path to file that contains PEM encoded certificates. Harvest will append these certificates to the system-wide set of root certificate authorities (CA). If not provided, the OS's root CAs will be used. To create certificates for ONTAP systems, see using certificate authentication | |
| use_insecure_tls | optional, bool | If true, disable TLS verification when connecting to ONTAP cluster | false | 
| credentials_file | optional, string | Path to a yaml file that contains cluster credentials. The file should have the same shape as harvest.yml. See here for examples. Path can be relative to harvest.yml or absolute. | |
| credentials_script | optional, section | Section that defines how Harvest should fetch credentials via external script. See here for details. | |
| tls_min_version | optional, string | Minimum TLS version to use when connecting to ONTAP cluster: One of tls10, tls11, tls12 or tls13 | Platform decides | 
| labels | optional, list of key-value pairs | Each of the key-value pairs will be added to a poller's metrics. Details below | |
| log_max_bytes | | Maximum size of the log file before it will be rotated | 10 MB |
| log_max_files | | Number of rotated log files to keep | 5 |
| log | optional, list of collector names | Matching collectors log their ZAPI request/response | |
| prefer_zapi | optional, bool | Use the ZAPI API if the cluster supports it, otherwise allow Harvest to choose REST or ZAPI, whichever is appropriate to the ONTAP version. See rest-strategy for details. | |
| conf_path | optional, colon-separated (:) list of directories | The search path Harvest uses to load its templates. Harvest walks each directory in order, stopping at the first one that contains the desired template. | conf |
| recorder | optional, section | Section that determines if Harvest should record or replay HTTP requests. See here for details. | |
Defaults¶
This section is optional. If there are parameters identical for all your pollers (e.g., datacenter, authentication method, login preferences), they can be grouped under this section. The poller section will be checked first, and if the values aren't found there, the defaults will be consulted.
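The lookup order described above can be pictured with a small Python sketch. This is an illustration only, not Harvest's implementation; the `resolve` helper and the sample values are hypothetical.

```python
def resolve(param, poller, defaults):
    """Return the poller's own value for param, else the Defaults value."""
    if param in poller:
        return poller[param]
    # Not set on the poller: consult the Defaults section instead.
    return defaults.get(param)


# Hypothetical contents of the Defaults section and of one poller.
defaults = {"datacenter": "dc-1", "auth_style": "basic_auth", "username": "harvest"}
poller = {"addr": "10.0.1.1", "username": "admin"}

print(resolve("username", poller, defaults))    # value set on the poller
print(resolve("datacenter", poller, defaults))  # value taken from Defaults
```

A parameter that appears in neither place resolves to `None` in this sketch.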
Exporters¶
All exporters need two types of parameters:
- exporter parameters - defined in harvest.yml under the Exporters section
- export_options - these options are defined in the Matrix data structure emitted from collectors and plugins
The following two parameters are required for all exporters:
| parameter | type | description | default | 
|---|---|---|---|
| Exporter name (header) | required | Name of the exporter instance, this is a user-defined value | |
| exporter | required | Name of the exporter class (e.g. Prometheus, InfluxDB, Http) - these can be found under the cmd/exporters/ directory | |
Note: when we talk about the Prometheus Exporter or InfluxDB Exporter, we mean the Harvest modules that send the data to a database, NOT the names used to refer to the actual databases.
Prometheus Exporter¶
InfluxDB Exporter¶
Tools¶
This section is optional. You can uncomment the grafana_api_token key and add your Grafana API token so Harvest does
not prompt you for the key when importing dashboards.
Tools:
  #grafana_api_token: 'aaa-bbb-ccc-ddd'
Poller_files¶
Harvest supports loading pollers from multiple files specified in the Poller_files section of your harvest.yml file.
For example, the following snippet tells Harvest to load pollers from all the *.yml files under the configs directory,
and from the path/to/single.yml file.
Paths may be relative or absolute.
Poller_files:
    - configs/*.yml
    - path/to/single.yml
Pollers:
    u2:
        datacenter: dc-1
Each referenced file can contain one or more unique pollers.
Ensure that you include the top-level Pollers section in these files.
All other top-level sections will be ignored.
For example:
# contents of configs/00-rtp.yml
Pollers:
  ntap3:
    datacenter: rtp
  ntap4:
    datacenter: rtp
---
# contents of configs/01-rtp.yml
Pollers:
  ntap5:
    datacenter: blr
---
# contents of path/to/single.yml
Pollers:
  ntap1:
    datacenter: dc-1
  ntap2:
    datacenter: dc-1
At runtime, all files will be read and combined into a single configuration. The example above would result in the following set of pollers in this order.
- u2
- ntap3
- ntap4
- ntap5
- ntap1
- ntap2
When using glob patterns, the list of matching paths will be sorted before they are read. Errors will be logged for all duplicate pollers and Harvest will refuse to start.
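The combining rules above (main-file pollers first, then each Poller_files entry in order, glob matches sorted, duplicates rejected) can be sketched in Python. This is a simplified model under stated assumptions, not Harvest's code: `combine_pollers` is a hypothetical helper, and the files are represented as an in-memory mapping of path to poller names rather than being read from disk.

```python
from fnmatch import fnmatch


def combine_pollers(main_pollers, poller_files, files_on_disk):
    """Return poller names in load order; raise if any name is duplicated."""
    ordered = list(main_pollers)
    seen = set(ordered)
    duplicates = []
    for pattern in poller_files:
        # Paths matching a glob pattern are sorted before they are read.
        matches = sorted(p for p in files_on_disk if fnmatch(p, pattern))
        for path in matches:
            for name in files_on_disk[path]:
                if name in seen:
                    duplicates.append((name, path))
                else:
                    seen.add(name)
                    ordered.append(name)
    if duplicates:
        # Stand-in for "errors are logged and Harvest refuses to start".
        raise ValueError(f"duplicate pollers: {duplicates}")
    return ordered


# In-memory stand-in for the example files shown above.
files_on_disk = {
    "configs/01-rtp.yml": ["ntap5"],
    "configs/00-rtp.yml": ["ntap3", "ntap4"],
    "path/to/single.yml": ["ntap1", "ntap2"],
}
order = combine_pollers(["u2"], ["configs/*.yml", "path/to/single.yml"], files_on_disk)
print(order)
```

With the example files, the sketch reproduces the order listed above: u2, ntap3, ntap4, ntap5, ntap1, ntap2.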
Configuring collectors¶
Collectors are configured by their own configuration files (templates), which are stored in subdirectories
in conf/.
Most collectors run concurrently and collect a subset of related metrics.
For example, node related metrics are grouped together and run independently of the disk-related metrics.
Below is a snippet from conf/zapi/default.yaml.
In this example, the default.yaml template contains a list of objects (e.g., Node) that reference sub-templates (e.g.,
node.yaml). This decomposition groups related metrics together and at runtime, a Zapi collector per object will be
created and each of these collectors will run concurrently.
Using the snippet below, we expect there to be four Zapi collectors running, each with a different subtemplate and
object.
collector:          Zapi
objects:
  Node:             node.yaml
  Aggregate:        aggr.yaml
  Volume:           volume.yaml
  SnapMirror:       snapmirror.yaml
At start-up, Harvest looks for two files (default.yaml and custom.yaml) in the conf directory of the
collector (e.g. conf/zapi/default.yaml).
The default.yaml is installed by default, while the custom.yaml is an optional file you can create to add new templates.
When present, the custom.yaml file will be merged with the default.yaml file.
This behavior can be overridden in your harvest.yml; see here for an example.
For a list of collector-specific parameters, refer to their individual documentation.
Zapi and ZapiPerf¶
Rest and RestPerf¶
EMS¶
StorageGRID¶
Unix¶
Labels¶
Labels offer a way to add additional key-value pairs to a poller's metrics. These allow you to tag a cluster's metrics in a cross-cutting fashion. Here's an example:
  cluster-03:
    datacenter: DC-01
    addr: 10.0.1.1
    labels:
      - org: meg       # add an org label with the value "meg"
      - ns:  rtp       # add a namespace label with the value "rtp"
These settings add two key-value pairs to each metric collected from cluster-03 like this:
node_vol_cifs_write_data{org="meg",ns="rtp",datacenter="DC-01",cluster="cluster-03",node="umeng-aff300-05"} 10
Keep in mind that each unique combination of key-value pairs increases the amount of stored data. Use them sparingly. See PrometheusNaming for details.
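To make the effect of labels concrete, the following Python sketch builds the metric line shown above from the poller's labels and the metric's own labels. It only illustrates the text format; the `render_metric` helper is hypothetical and is not how Harvest's exporter is implemented.

```python
def render_metric(name, labels, value):
    """Format a metric line as name{key="value",...} value."""
    pairs = ",".join(f'{key}="{val}"' for key, val in labels.items())
    return f"{name}{{{pairs}}} {value}"


# Labels from the poller's `labels` list in harvest.yml (one pair per item).
poller_labels = [{"org": "meg"}, {"ns": "rtp"}]
# Labels the metric already carries.
metric_labels = {"datacenter": "DC-01", "cluster": "cluster-03", "node": "umeng-aff300-05"}

merged = {}
for pair in poller_labels:
    merged.update(pair)
merged.update(metric_labels)

line = render_metric("node_vol_cifs_write_data", merged, 10)
print(line)
```

Every distinct set of label values yields a distinct time series, which is why each added label can multiply the amount of stored data.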
HTTP Recorder¶
When troubleshooting, it can be useful to record HTTP requests and responses to disk for later replay.
Harvest removes Authorization and Host headers from recorded requests and responses
to prevent sensitive information from being stored on disk.
The recorder section in the harvest.yml file allows you to configure the HTTP recorder.
| parameter | type | description | default | 
|---|---|---|---|
| path | string, required | Path to a directory. Recorded requests and responses will be stored here. Replaying will read the requests and responses from this directory. | |
| mode | string, required | record or replay | |
| keep_last | optional, int | When mode is record, the number of records to keep before overwriting | 60 | 
Authentication¶
When authenticating with ONTAP and StorageGRID clusters, Harvest supports both client certificates and basic authentication.
These methods of authentication are defined in the Pollers or Defaults section of your harvest.yml using one or more
of the following parameters.
| parameter | description | default | Link | 
|---|---|---|---|
| auth_style | One of basic_auth or certificate_auth. Optional when using credentials_file or credentials_script | basic_auth | link |
| username | Username used for authenticating to the remote system | | link |
| password | Password used for authenticating to the remote system | | link |
| credentials_file | Relative or absolute path to a yaml file that contains cluster credentials | | link |
| credentials_script | External script Harvest executes to retrieve credentials | | link |
Precedence¶
When multiple authentication parameters are defined at the same time, Harvest tries each method listed below, in the following order, to resolve authentication requests. The first method that returns a non-empty password stops the search.
When these parameters exist in both the Pollers and Defaults section,
the Pollers section will be consulted before the Defaults.
| section | parameter | 
|---|---|
| Pollers | auth_style: certificate_auth | 
| Pollers | auth_style: basic_auth with username and password |
| Pollers | credentials_script | 
| Pollers | credentials_file | 
| Defaults | auth_style: certificate_auth | 
| Defaults | auth_style: basic_auth with username and password |
| Defaults | credentials_script | 
| Defaults | credentials_file | 
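The precedence table can be read as an ordered search. The Python sketch below is a simplified model of that order, not Harvest's code: `resolve_auth`, `run_script`, and `read_file` are hypothetical names, and treating certificate_auth as an immediate match (it has no password to return) is an assumption made for illustration.

```python
def resolve_auth(poller, defaults, run_script, read_file):
    """Walk the precedence table; return (section, method, password)."""
    for section, cfg in (("Pollers", poller), ("Defaults", defaults)):
        if cfg.get("auth_style") == "certificate_auth":
            # Assumption: certificate auth ends the search without a password.
            return section, "certificate_auth", None
        if cfg.get("username") and cfg.get("password"):
            return section, "basic_auth", cfg["password"]
        if "credentials_script" in cfg:
            password = run_script(cfg["credentials_script"])
            if password:  # an empty password means "keep searching"
                return section, "credentials_script", password
        if "credentials_file" in cfg:
            password = read_file(cfg["credentials_file"])
            if password:
                return section, "credentials_file", password
    return None, None, None


# The poller's script returns nothing, so the search falls through to Defaults.
poller = {"credentials_script": {"path": "./get_credentials"}}
defaults = {"username": "harvest", "password": "secret"}
result = resolve_auth(poller, defaults, run_script=lambda s: "", read_file=lambda p: "")
print(result)
```

The two callables stand in for running the script and reading the credentials file, so the search order can be exercised without touching the filesystem.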
Credentials File¶
If you would rather not list cluster credentials in your harvest.yml, you can use the credentials_file section
in your harvest.yml to point to a file that contains the credentials.
At runtime, the credentials_file will be read and the included credentials will be used to authenticate with the
matching cluster(s).
This is handy when integrating with third-party credential stores. See #884 for examples.
The format of the credentials_file is similar to harvest.yml and can contain multiple cluster credentials.
Example:
Snippet from harvest.yml:
Pollers:
  cluster1:
    addr: 10.193.48.11
    credentials_file: secrets/cluster1.yml
    exporters:
      - prom1 
File secrets/cluster1.yml:
Pollers:
  cluster1:
    username: harvest
    password: foo
Credentials Script¶
The credentials_script feature allows you to fetch authentication information via an external script. This can be configured in the Pollers section of your harvest.yml file, as shown in the example below.
At runtime, Harvest will invoke the script specified in the credentials_script path section. Harvest will call the script with one or two arguments depending on how your poller is configured in the harvest.yml file. The script will be called like this: ./script $addr or ./script $addr $username.
- The first argument $addr is the address of the cluster taken from the addr field under the Pollers section of your harvest.yml file.
- The second argument $username is the username for the cluster taken from the username field under the Pollers section of your harvest.yml file. If your harvest.yml does not include a username, nothing will be passed.
The script should communicate the credentials to Harvest by writing the response to its standard output (stdout). Harvest supports two output formats from the script: YAML and plain text.
YAML format¶
If the script outputs a YAML object with username and password keys, Harvest will use both the username and password from the output. For example, if the script writes the following, Harvest will use myuser and mypassword for the poller's credentials.
   
username: myuser
password: mypassword
If only the password is provided, Harvest will use the username from the harvest.yml file, if available. If your username or password contains spaces, #, or other characters with special meaning in YAML, make sure you quote the value like so:
   password: "my password with spaces"
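If the credentials script is written in Python, one way to get the quoting right is to let `json.dumps` produce the quoted value, since a JSON string is also a valid YAML double-quoted scalar for typical passwords. This is a minimal sketch; the `credentials_yaml` helper and the sample values are hypothetical.

```python
import json


def credentials_yaml(username, password):
    """Build the YAML a credentials script writes to stdout, quoting both values."""
    # json.dumps wraps the value in double quotes and escapes embedded
    # quotes and backslashes, so spaces and '#' lose their YAML meaning.
    return f"username: {json.dumps(username)}\npassword: {json.dumps(password)}\n"


output = credentials_yaml("myuser", "my password with # and spaces")
print(output, end="")
```

Quoting every value, rather than only the ones that look risky, avoids having to reason about which characters YAML treats specially.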
If the script outputs a YAML object containing an authToken, Harvest will use this authToken when communicating with ONTAP or StorageGRID clusters. Harvest will include the authToken in the HTTP request's authorization header using the Bearer authentication scheme.
   
authToken: eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJEcEVkRmgyODlaTXpYR25OekFvaWhTZ0FaUnBtVlVZSDJ3R3dXb0VIWVE0In0.eyJleHAiOjE3MjE4Mj
When the script outputs an authToken, the username and password fields are ignored if they are present in the script's output.
Plain text format¶
If the script outputs plain text, Harvest will use the output as the password. The username will be taken from the harvest.yml file, if available. For example, if the script writes the following to its stdout, Harvest will use the username defined in that poller's section of the harvest.yml and mypassword for the poller's credentials.
   
mypassword
If the script doesn't finish within the specified timeout, Harvest will terminate the script and any spawned processes.
Credential scripts are defined under the credentials_script section within Pollers in your harvest.yml. Below are the options for the credentials_script section:
| parameter | type | description | default | 
|---|---|---|---|
| path | string | Absolute path to the script that takes two arguments: addr and username, in that order. | |
| schedule | go duration or always | Schedule for calling the authentication script. If set to always, the script is called every time a password is requested; otherwise, the previously cached value is used. | 24h | 
| timeout | go duration | Maximum time Harvest will wait for the script to finish before terminating it and its descendants. | 10s | 
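The schedule option behaves like a time-based cache in front of the script. The Python sketch below models that behavior under stated assumptions: `CachedCredentials` and `parse_duration` are hypothetical names, the parser handles only single-unit durations such as 10s, 3h, or 24h (Go durations allow more forms), and none of this is Harvest's implementation.

```python
import time

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Parse a single-unit duration such as '10s' or '3h' into seconds."""
    return float(text[:-1]) * UNIT_SECONDS[text[-1]]


class CachedCredentials:
    """Call fetch() only when the cached value is older than the schedule."""

    def __init__(self, fetch, schedule="24h", clock=time.monotonic):
        self.fetch = fetch
        self.always = schedule == "always"
        self.ttl = 0.0 if self.always else parse_duration(schedule)
        self.clock = clock
        self.value = None
        self.expires_at = None

    def get(self):
        now = self.clock()
        if self.always or self.expires_at is None or now >= self.expires_at:
            # Cache is empty or stale, or the schedule is 'always': run the script.
            self.value = self.fetch()
            self.expires_at = now + self.ttl
        return self.value


# Example with a fake clock so the expiry can be observed without waiting.
calls = []
fake_now = [0.0]


def fetch():
    calls.append(1)
    return f"password-{len(calls)}"


cache = CachedCredentials(fetch, schedule="3h", clock=lambda: fake_now[0])
first = cache.get()
second = cache.get()           # served from the cache
fake_now[0] = 3 * 3600
third = cache.get()            # schedule elapsed, script runs again
print(first, second, third)
```

Passing the clock in as a parameter keeps the sketch testable; a real caller would rely on the default monotonic clock.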
Example¶
Here is an example of how to configure the credentials_script in the harvest.yml file:
Pollers:
  ontap1:
    datacenter: rtp
    addr: 10.1.1.1
    username: admin # Optional: if not provided, the script must return the username
    collectors:
      - Rest
      - RestPerf
    credentials_script:
      path: ./get_credentials
      schedule: 3h
      timeout: 10s
In this example, the get_credentials script should be located in the same directory as the harvest.yml file and should be executable. It should output the credentials in either YAML or plain text format. Here are three example scripts:
get_credentials that outputs username and password in YAML format:
#!/bin/bash
cat << EOF
username: myuser
password: mypassword
EOF
get_credentials that outputs authToken in YAML format:
#!/bin/bash
# script requests an access token from the authorization server
# authorization returns an access token to the script
# script writes the YAML formatted authToken like so:
cat << EOF
authToken: $authToken
EOF
Below are a couple of OAuth2 credential script examples for authenticating with ONTAP or StorageGRID OAuth2-enabled clusters.
These are examples that you will need to adapt to your environment.
Example OAuth2 script authenticating with the Keycloak auth provider via curl. Uses jq to extract the token. This script outputs the authToken in YAML format.
#!/bin/bash
response=$(curl --silent "http://{KEYCLOAK_IP:PORT}/realms/{REALM_NAME}/protocol/openid-connect/token" \
  --header "Content-Type: application/x-www-form-urlencoded" \
  --data-urlencode "grant_type=password" \
  --data-urlencode "username={USERNAME}" \
  --data-urlencode "password={PASSWORD}" \
  --data-urlencode "client_id={CLIENT_ID}" \
  --data-urlencode "client_secret={CLIENT_SECRET}")
access_token=$(echo "$response" | jq -r '.access_token')
cat << EOF
authToken: $access_token
EOF
Example OAuth2 script authenticating with the Auth0 auth provider via curl. Uses jq to extract the token. This script outputs the authToken in YAML format.
#!/bin/bash
response=$(curl --silent https://{AUTH0_TENANT_URL}/oauth/token \
  --header 'content-type: application/json' \
  --data '{"client_id":"{CLIENT_ID}","client_secret":"{CLIENT_SECRET}","audience":"{ONTAP_CLUSTER_IP}","grant_type":"client_credentials"}')
access_token=$(echo "$response" | jq -r '.access_token')
cat << EOF
authToken: $access_token
EOF
get_credentials that outputs only the password in plain text format:
#!/bin/bash
echo "mypassword"
Troubleshooting¶
- Make sure your script is executable
- If running the poller from a container, ensure that you have mounted the script so that it is available inside the container and that you have updated the path in the harvest.yml file to reflect the path inside the container.
- If running the poller from a container, ensure that your shebang references an interpreter that exists inside the container. Harvest containers are built from Distroless images, so you may need to use #!/busybox/sh.
- Ensure the user/group that executes your poller also has read and execute permissions on the script. 
  One way to test this is to su to the user/group that runs Harvest and ensure that the su-ed user/group can execute the script too.
- When your script outputs YAML, make sure it is valid YAML. You can use YAML Lint to check your output.
- When your script outputs YAML and you want to include debug logging, make sure to redirect the debug output to stderr instead of stdout, or write the debug output as YAML comments prefixed with #.
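The last two points can be illustrated with a credentials script written in Python, assuming a Python interpreter is available wherever the poller runs. This is a sketch only: `emit_credentials`, the fallback username, and the hard-coded password are placeholders for whatever lookup your environment needs.

```python
#!/usr/bin/env python3
import json
import sys


def emit_credentials(argv, out=sys.stdout, err=sys.stderr):
    """Write debug text to stderr and only the YAML credentials to stdout."""
    addr = argv[1] if len(argv) > 1 else ""
    # The username is passed as the second argument only when harvest.yml sets one.
    username = argv[2] if len(argv) > 2 else "myuser"
    print(f"debug: looking up credentials for {addr}", file=err)
    password = "mypassword"  # placeholder: replace with a real lookup
    print(f"username: {json.dumps(username)}", file=out)
    print(f"password: {json.dumps(password)}", file=out)


emit_credentials(sys.argv)
```

Because the debug line goes to stderr, stdout contains nothing but the two YAML keys.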