cluster.io-pool.size |
(none) |
Integer |
The size of the IO executor pool used by the cluster to execute blocking IO operations (in Master as well as TaskManager processes). By default, it uses 4 * the number of CPU cores (hardware contexts) that the cluster process has access to. Increasing the pool size allows more IO operations to run concurrently. |
cluster.registration.error-delay |
10 s |
Duration |
The pause made after a registration attempt caused an exception (other than timeout). |
cluster.registration.initial-timeout |
100 ms |
Duration |
Initial registration timeout between cluster components. |
cluster.registration.max-timeout |
30 s |
Duration |
Maximum registration timeout between cluster components. |
cluster.registration.refused-registration-delay |
30 s |
Duration |
The pause made after the registration attempt was refused. |
cluster.services.shutdown-timeout |
30 s |
Duration |
The shutdown timeout for cluster services like executors. |
heartbeat.interval |
10 s |
Duration |
Time interval between heartbeat RPC requests from the sender to the receiver side. |
heartbeat.rpc-failure-threshold |
2 |
Integer |
The number of consecutive failed heartbeat RPCs until a heartbeat target is marked as unreachable. Failed heartbeat RPCs can be used to detect dead targets faster because they no longer receive the RPCs. The detection time is heartbeat.interval * heartbeat.rpc-failure-threshold. In environments with a flaky network, setting this value too low can produce false positives. In this case, we recommend increasing this value, but not higher than heartbeat.timeout / heartbeat.interval. The mechanism can be disabled by setting this option to -1. |
heartbeat.timeout |
50 s |
Duration |
Timeout for requesting and receiving heartbeats for both sender and receiver sides. |
jobmanager.execution.failover-strategy |
"region" |
String |
This option specifies how the job computation recovers from task failures. Accepted values are:
- 'full': Restarts all tasks to recover the job.
- 'region': Restarts all tasks that could be affected by the task failure. More details can be found here.
|
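The arithmetic described for cluster.io-pool.size and heartbeat.rpc-failure-threshold can be sketched as follows. This is a minimal illustration in Python, not part of the configuration system itself: the function names are hypothetical, durations are assumed to be given in whole seconds, and the upper bound is assumed to be rounded down to an integer.

```python
import os


def default_io_pool_size(cpu_cores=None):
    """Default described for cluster.io-pool.size: 4 * available CPU cores.

    Hypothetical helper; falls back to os.cpu_count() when no core count is given.
    """
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return 4 * cores


def rpc_failure_detection_seconds(interval_s, threshold):
    """Detection time = heartbeat.interval * heartbeat.rpc-failure-threshold.

    Returns None when the mechanism is disabled (threshold == -1).
    """
    if threshold == -1:
        return None
    return interval_s * threshold


def max_recommended_threshold(timeout_s, interval_s):
    """Recommended upper bound: heartbeat.timeout / heartbeat.interval.

    Rounding down is an assumption made for this sketch.
    """
    return timeout_s // interval_s


# Values taken from the defaults listed in the table above.
print(default_io_pool_size(8))                # 8 cores -> 32
print(rpc_failure_detection_seconds(10, 2))   # 10 s * 2 -> 20
print(max_recommended_threshold(50, 10))      # 50 s / 10 s -> 5
```

With the listed defaults, an unreachable target would be detected after about 20 s of failed RPCs, well before the 50 s heartbeat.timeout, and the threshold could be raised up to 5 before it exceeds the recommended bound.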