Operations Output Framework
  • 09 Aug 2022

The Operations Output Framework enables data forwarding from Graylog clusters to external systems through a variety of network transport methods and payload formats. In addition, framework-based outputs can be configured to use Processing Pipelines to filter, modify, and enrich the outbound messages.

Note

This is an Operations Integrations feature, available since Graylog version 3.3.3. It requires an Operations license. See the Integrations Setup page for more information.

About the Framework

The Operations Output Framework provides several new outputs for various network transport types. All of these outputs first write messages to an on-disk journal in the Graylog cluster. Messages stay in the on-disk journal until the output can successfully send the data to the external receiver.

Once the messages have been written to the journal, you can run them through a processing pipeline to modify or enrich logs with additional data, transform the message contents, or filter out any logs before sending.

After pipeline processing, the Output Framework converts the output payload to the desired format and then sends it using the selected transport method.

Messages arrive at the Output Framework once the Graylog source cluster has finished processing them, at the same time as the data is written to Elasticsearch.

On-Disk Journal

The Output Framework is equipped with an on-disk journal, which immediately persists messages received from the Graylog Output system to the disk, and then sends the messages to the external receiver. The Output Framework continually receives and reliably queues messages, even if the external receiver is temporarily unavailable due to network issues. The on-disk journal configuration options are described below.
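The store-and-forward behavior described above can be illustrated with a minimal sketch. This is not Graylog's implementation: the `Journal` class, its methods, and the `send` callable are hypothetical names, and the queue lives in memory here, whereas the real journal persists to disk.

```python
from collections import deque
from typing import Callable, Deque


class Journal:
    """Hypothetical store-and-forward queue (in memory, for illustration only)."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()

    def append(self, message: str) -> None:
        # Queue first: the message is stored before any send is attempted.
        self._pending.append(message)

    def flush(self, send: Callable[[str], None]) -> int:
        """Send queued messages in order and stop at the first failure.

        A message is removed only after `send` returns without raising,
        so an unavailable receiver leaves it queued for the next flush.
        """
        sent = 0
        while self._pending:
            try:
                send(self._pending[0])
            except ConnectionError:
                break  # receiver unavailable; keep the message for a later retry
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Calling `flush` again after the receiver recovers delivers the queued messages in their original order.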

The journal data is stored in the directory controlled by the data_dir value in the Graylog configuration file. Journal data for Framework Outputs is stored in <data_dir>/stream_output/<OutputID>. As with the Output Base Path and the Input Journal, use a separate partition for Output Framework journals so that journal growth does not impact overall system performance.
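As a small illustration of the directory layout above, the following sketch builds the journal path for one output. The function name, the example data_dir value, and the example output ID are assumptions made for illustration; only the `<data_dir>/stream_output/<OutputID>` pattern comes from this page.

```python
from pathlib import PurePosixPath


def output_journal_dir(data_dir: str, output_id: str) -> PurePosixPath:
    """Return <data_dir>/stream_output/<OutputID> for a framework output."""
    return PurePosixPath(data_dir) / "stream_output" / output_id
```

For example, `output_journal_dir("/var/lib/graylog-server", "example-output-id")` points at the directory to monitor for disk usage, assuming data_dir is set to `/var/lib/graylog-server` on that node.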

Note

Maximum Journal Size is a soft limit for Operations Outputs; the on-disk journal may grow larger than this value. To guarantee journal data is cleaned up in a timely fashion, adjust the Maximum Journal Message Age and Journal Segment Age configuration values. Even unsent messages in the journal are purged once they are older than the Maximum Journal Message Age.
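The age-based purge rule in the note can be sketched as follows. This is a simplified model, assuming per-message timestamps; `JournalEntry` and `purge_expired` are hypothetical names, and the actual journal may clean up at a coarser granularity (for example, whole segments).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JournalEntry:
    written_at: float  # seconds since the epoch when the entry was journaled
    payload: str
    sent: bool = False


def purge_expired(
    entries: List[JournalEntry], now: float, max_age_seconds: float
) -> List[JournalEntry]:
    """Drop entries older than the maximum message age.

    The `sent` flag is deliberately ignored: unsent entries are purged too
    once they exceed the age limit.
    """
    return [e for e in entries if now - e.written_at <= max_age_seconds]
```

The practical consequence is that a long receiver outage combined with a short Maximum Journal Message Age can lead to data loss, so the age limit should exceed the longest outage you expect to tolerate.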

Pipeline Integration

When creating or editing a framework-based output, you can select a processing pipeline, which is executed on each message coming from the source stream. Use this pipeline to filter out messages that you do not wish to forward, and to modify the contents of the outgoing message or enrich it with additional data.
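Conceptually, such a pipeline acts as a per-message function that either drops the message or returns a modified copy. The sketch below models that idea in Python only; real pipelines are written as Graylog pipeline rules, and the field names used here (`level`, `password`, `forwarded_by`) are hypothetical.

```python
from typing import Any, Dict, Optional

Message = Dict[str, Any]


def forward_pipeline(message: Message) -> Optional[Message]:
    """Return the message to forward, or None to filter it out."""
    # Filter: do not forward debug-level messages (syslog severity 7).
    if message.get("level") == 7:
        return None

    outgoing = dict(message)  # work on a copy; leave the source message untouched

    # Modify: mask a sensitive field if it is present.
    if "password" in outgoing:
        outgoing["password"] = "***"

    # Enrich: add a field that did not exist on the source message.
    outgoing["forwarded_by"] = "output-framework"
    return outgoing
```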

Outbound Payload Formatting

Before sending data out over the wire, Graylog formats the outgoing payload. Payload formatting options include:

  • JSON Formatter

    • The Output Framework will convert the message’s key-value pairs into a JSON object.
  • Pipeline-Generated

    • The Output Framework expects the pipeline to generate the outgoing payload and store it in the pipeline_output field of the message. This can be done in the pipeline with the built-in set_field function.
  • Full Message

    • Some inputs support storage of the full received message in the full_message field. When this output formatter is selected, the contents of the full_message field are used as the payload of the outgoing message. Messages without a full_message field, or with an empty one, are ignored. The Full Message formatter is available in Graylog version 4.0.3 and above.
  • No-op Formatter

    • Does not generate a payload from the message. The No-op Formatter is only intended for use with the Google Cloud BigQuery output. If used with any other Output, the Output payloads will be empty.
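The first three formatters can be approximated with the sketch below, which treats a message as a plain dictionary. The function names are hypothetical, the sorted key order in the JSON output is a choice made here for reproducibility, and returning `None` when pipeline_output is missing is an assumption, since this page does not describe that case.

```python
import json
from typing import Any, Dict, Optional

Message = Dict[str, Any]


def format_json(message: Message) -> str:
    """JSON Formatter: serialize the message's key-value pairs as a JSON object."""
    return json.dumps(message, sort_keys=True)


def format_pipeline_generated(message: Message) -> Optional[str]:
    """Pipeline-Generated: use whatever the pipeline stored in pipeline_output."""
    return message.get("pipeline_output")


def format_full_message(message: Message) -> Optional[str]:
    """Full Message: use full_message; a missing or empty field means 'ignore'."""
    full = message.get("full_message")
    return full if full else None
```

A return value of `None` from `format_full_message` corresponds to the message being ignored rather than sent with an empty payload.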

Framework Outputs

  • Operations TCP Raw/Plaintext Output

    • Sends messages as UTF-8 encoded plain text to the configured TCP endpoint (IP address and port).
  • Operations TCP Syslog Output

    • Sends formatted messages as the MSG portion of a standard Syslog message, per section 6.4 of the Syslog specification (RFC 5424). The Syslog message is sent to the configured TCP endpoint (IP address and port).
  • Operations Google Cloud BigQuery Output

    • The Output Framework converts the message’s key-value pairs into a new row for insertion into the specified Google BigQuery table.
  • Operations STDOUT Output

    • Displays formatted messages on the system’s console. This is primarily included as a debugging tool for pipeline changes.
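To show what placing a payload in the MSG portion of a Syslog message means, here is a minimal sketch that assumes the Syslog specification referred to above is RFC 5424, whose section 6.4 defines MSG. The header values (facility, severity, hostname, application name) are illustrative assumptions; this page does not state which values the output actually sends, nor how messages are framed on the TCP stream.

```python
from datetime import datetime, timezone


def syslog_5424(
    payload: str,
    hostname: str,
    app_name: str,
    timestamp: datetime,
    facility: int = 1,   # 1 = user-level messages
    severity: int = 6,   # 6 = informational
) -> str:
    """Wrap a formatted payload as the MSG part of an RFC 5424 message."""
    pri = facility * 8 + severity  # PRI value as defined by RFC 5424
    ts = timestamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
    # "-" is the NILVALUE for fields left unset.
    return f"<{pri}>1 {ts} {hostname} {app_name} - - - {payload}"
```

A receiver that parses RFC 5424 would see the formatted Graylog payload, for example a JSON object, as the free-form message text after the header.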

Output Configuration

The Operations Output Framework can process messages at very high throughput rates. Throughput is affected by many hardware factors, such as CPU clock speed, number of CPU cores, available memory, and network bandwidth. Several Output Framework configuration options are available to help tune performance for throughput requirements and environments.

Common Configuration

  • Title

    • The name of the output.
  • Send Buffer Size

    • The number of messages the output can hold in its buffer while waiting to be written to the Journal.
  • Concurrent message processing pipelines

    • The number of pipeline instances that are allowed to run at any given time.
    • If set to 0, pipeline execution is skipped, even if a pipeline is selected from the pipeline dropdown.
  • Concurrent output payload formatters

    • The number of formatter instances that are allowed to run at any given time.
    • If this is set to 0, the output will fail.
  • Concurrent message senders

    • The number of sender instances that are allowed to run at any given time.
    • If this is set to 0, the output will fail.
  • Journal Segment Size

    • The soft maximum size of a journal segment file.
  • Journal Segment Age

    • The maximum amount of time journal segments are retained, provided storage space is available.
  • Maximum Journal Size

    • The maximum size of the journal.
  • Maximum Journal Message Age

    • The maximum time that a message is stored in the disk journal.
  • Journal Buffer Size

    • Memory buffer size for messages waiting to be written to the journal.
    • This value must be a power of two.
  • Journal Buffer Encoders

    • The number of concurrent encoders for messages being written to the journal.
  • Output Processing Pipeline

    • The pipeline that processes all messages sent to the output.
  • Outbound Payload Format

    • The format used for outgoing message payloads.
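The constraints listed above (zero formatters or senders makes the output fail, zero pipelines skips pipeline execution, and the journal buffer size must be a power of two) can be checked before saving a configuration. The sketch below is a hypothetical pre-flight check; the parameter names are illustrative and are not Graylog configuration keys, and treating negative values as invalid is an assumption.

```python
from typing import List


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...; a power of two has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def check_output_config(
    formatters: int, senders: int, journal_buffer_size: int
) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems: List[str] = []
    if formatters < 1:
        problems.append("concurrent output payload formatters must be at least 1")
    if senders < 1:
        problems.append("concurrent message senders must be at least 1")
    if not is_power_of_two(journal_buffer_size):
        problems.append("journal buffer size must be a power of two")
    return problems


def pipeline_will_run(pipelines: int, pipeline_selected: bool) -> bool:
    """A selected pipeline only runs if at least one pipeline instance is allowed."""
    return pipeline_selected and pipelines > 0
```

Note that `pipeline_will_run(0, True)` is `False`: setting the pipeline concurrency to 0 is valid, but it silently disables the selected pipeline.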
