hdfs
Sends message parts as files to a HDFS directory.
- Common
- Advanced
# Common config fields, showing default valuesoutput:label: ""hdfs:hosts: []user: ""directory: ""path: ${!count("files")}-${!timestamp_unix_nano()}.txtmax_in_flight: 1batching:count: 0byte_size: 0period: ""check: ""
# All config fields, showing default valuesoutput:label: ""hdfs:hosts: []user: ""directory: ""path: ${!count("files")}-${!timestamp_unix_nano()}.txtmax_in_flight: 1batching:count: 0byte_size: 0period: ""check: ""processors: []
Each file is written with the path specified with the 'path' field, in order to have a different path for each object you should use function interpolations described here.
Performance​
This output benefits from sending multiple messages in flight in parallel for
improved performance. You can tune the max number of in flight messages with the
field max_in_flight.
Fields​
hosts​
A list of hosts to connect to.
Type: array
Default: []
# Exampleshosts: localhost:9000
user​
A user identifier.
Type: string
Default: ""
directory​
A directory to store message files within. If the directory does not exist it will be created.
Type: string
Default: ""
path​
The path to upload messages as, interpolation functions should be used in order to generate unique file paths. This field supports interpolation functions.
Type: string
Default: "${!count(\"files\")}-${!timestamp_unix_nano()}.txt"
# Examplespath: ${!count("files")}-${!timestamp_unix_nano()}.txt
max_in_flight​
The maximum number of messages to have in flight at a given time. Increase this to improve throughput.
Type: int
Default: 1
batching​
Allows you to configure a batching policy.
Type: object
# Examplesbatching:byte_size: 5000count: 0period: 1sbatching:count: 10period: 1sbatching:check: this.contains("END BATCH")count: 0period: 1m
batching.count​
A number of messages at which the batch should be flushed. If 0 disables count based batching.
Type: int
Default: 0
batching.byte_size​
An amount of bytes at which the batch should be flushed. If 0 disables size based batching.
Type: int
Default: 0
batching.period​
A period in which an incomplete batch should be flushed regardless of its size.
Type: string
Default: ""
# Examplesperiod: 1speriod: 1mperiod: 500ms
batching.check​
A Bloblang query that should return a boolean value indicating whether a message should end a batch.
Type: string
Default: ""
# Examplescheck: this.type == "end_of_transaction"
batching.processors​
A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op.
Type: array
Default: []
# Examplesprocessors:- archive:format: linesprocessors:- archive:format: json_arrayprocessors:- merge_json: {}