This guide doesn’t provide a step-by-step tutorial for building a multi-node Graylog cluster, but it does give some advice on questions that might arise during the setup.

It’s important for such a project that you understand each step in the setup process and do some planning upfront. Without a proper roadmap of all the things you want to achieve with a Graylog cluster, you will be lost on the way.

Graylog should be the last component you install in this setup. Its dependencies, namely MongoDB and Elasticsearch, have to be up and running first.

Warning: This guide doesn’t include instructions for running a multi-node Graylog cluster in an untrusted network. We assume that the connection between the hosts is trusted and doesn’t have to be secured individually.

Prerequisites

Every server which is part of this setup should have the software requirements installed to run the targeted software. All software requirements can be found in the installation manual.

We highly recommend that the system time on all systems is kept in sync via NTP or a similar mechanism. DNS resolution must be working, too.
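A quick way to confirm that DNS resolution works is to try resolving every host name of the planned cluster from each server. The following is a minimal sketch using only the Python standard library; the function name and the injectable `resolve` parameter are illustrative assumptions, not part of Graylog.

```python
import socket


def unresolved_hosts(hostnames, resolve=socket.getaddrinfo):
    """Return the host names that could not be resolved.

    `resolve` defaults to the system resolver and can be swapped
    out, for example to test the function without network access.
    """
    failed = []
    for name in hostnames:
        try:
            resolve(name, None)
        except socket.gaierror:
            # The resolver has no address for this name.
            failed.append(name)
    return failed
```

Running it on every server with the full list of cluster host names should return an empty list before you continue with the installation.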

In order to simplify the installation process, the servers should have a working Internet connection.

MongoDB Replica Set

We recommend deploying a MongoDB replica set.

MongoDB doesn’t have to run on dedicated servers for the workload generated by Graylog, but you should follow the recommendations given in the MongoDB documentation about architecture. Most important is that you have an odd number of MongoDB servers in the replica set.

In most setups, each Graylog server will also host an instance of MongoDB which is part of the same replica set and shares the data with all other nodes in the cluster.

Hint: To avoid unauthorized access to your MongoDB database, the MongoDB replica set should be set up with authentication.

The correct order of working steps should be as follows:

  1. Create the replica set (rs01)
  2. Create the database (graylog)
  3. Create a user account for accessing the database, which has the roles readWrite and dbAdmin.

If your MongoDB needs to be reachable over the network, set the IP address it should listen on with bind_ip in the configuration.
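The steps above can be sketched as the two documents you would hand to MongoDB: the replica set configuration (as passed to `rs.initiate()` in the mongo shell) and the user definition (as passed to `db.createUser()`). This sketch only builds the documents as Python dictionaries and does not talk to MongoDB. The helper names are hypothetical, and the host names and port 27017 used in the usage note are assumptions borrowed from the connection string example in this guide.

```python
def replica_set_config(name, hosts):
    """Build a replica set configuration document for `name`."""
    if len(hosts) % 2 == 0:
        # The guide recommends an odd number of MongoDB servers.
        raise ValueError("use an odd number of replica set members")
    members = [{"_id": i, "host": host} for i, host in enumerate(hosts)]
    return {"_id": name, "members": members}


def graylog_user_document(username, password, database="graylog"):
    """Build a user document with the readWrite and dbAdmin roles."""
    return {
        "user": username,
        "pwd": password,
        "roles": [
            {"role": "readWrite", "db": database},
            {"role": "dbAdmin", "db": database},
        ],
    }
```

For example, `replica_set_config("rs01", ["mongodb-node01:27017", "mongodb-node02:27017", "mongodb-node03:27017"])` yields a three-member configuration for the replica set rs01.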

Elasticsearch Cluster

The Elasticsearch setup documentation should help you to install Elasticsearch with a robust base configuration.

It is important to consider the name of the new Elasticsearch cluster. In most cases the default name should be changed to avoid conflicts with Elasticsearch nodes using the default configuration. Choose any other name, because any Elasticsearch instance that is started in the same network with the default configuration will try to connect to a cluster with the default name.

Each Elasticsearch server needs one IP address that can be reached over the network set in network.host, and some of the other cluster members listed in discovery.zen.ping.unicast.hosts. That is enough for a minimal cluster setup.

When you secure your Elasticsearch cluster with user authentication, you need to add the credentials to the Graylog configuration so that Graylog can use the secured cluster.
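As a rough illustration of such a minimal setup, the sketch below renders the three settings mentioned above as lines for elasticsearch.yml. The function name, the cluster name and the IP addresses in the usage note are hypothetical. The check against "elasticsearch" reflects the default cluster name, and discovery.zen.ping.unicast.hosts is the setting named in this guide; newer Elasticsearch versions may use a different discovery setting.

```python
import json


def minimal_es_settings(cluster_name, network_host, unicast_hosts):
    """Render a minimal set of elasticsearch.yml lines as text."""
    if cluster_name == "elasticsearch":
        # "elasticsearch" is the default name and should be avoided.
        raise ValueError("choose a cluster name other than the default")
    lines = [
        f"cluster.name: {cluster_name}",
        f"network.host: {network_host}",
        # json.dumps produces a list that is also valid YAML.
        f"discovery.zen.ping.unicast.hosts: {json.dumps(unicast_hosts)}",
    ]
    return "\n".join(lines) + "\n"
```

Calling `minimal_es_settings("graylog-cluster", "10.0.0.11", ["10.0.0.12", "10.0.0.13"])` returns the text for one node; each node gets its own network.host value.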

Graylog Multi-Node

After installing Graylog, make sure that only one Graylog node is configured as master with the configuration setting is_master = true.
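One way to catch a misconfiguration early is to compare the is_master value across all nodes before starting them. The sketch below assumes you have already read each node's configuration file into a dictionary of strings; the function names and the node names in the usage note are hypothetical.

```python
def master_nodes(node_settings):
    """Return the names of all nodes with is_master set to true.

    `node_settings` maps a node name to a dict of its settings,
    with values kept as the strings found in the config file.
    """
    return [
        name
        for name, settings in node_settings.items()
        if settings.get("is_master", "false").strip().lower() == "true"
    ]


def check_single_master(node_settings):
    """Return the master's name, or raise if there is not exactly one."""
    masters = master_nodes(node_settings)
    if len(masters) != 1:
        raise ValueError(f"expected exactly one master, found {masters}")
    return masters[0]
```

For instance, `check_single_master({"graylog-node01": {"is_master": "true"}, "graylog-node02": {"is_master": "false"}})` returns "graylog-node01", while two nodes with is_master = true raise an error.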

The address configured in http_bind_address needs to be reachable by all Graylog nodes in the cluster. The http_publish_uri is normally auto-generated from the http_bind_address. This URI is used for the inter-node communication.

If the http_bind_address is configured with 0.0.0.0, you must configure http_publish_uri. Otherwise Graylog will use the first non-loopback IP address, which might not fit your desired design. All Graylog nodes need to reach all other Graylog nodes via their configured http_publish_uri for inter-node communication. If you use TLS in your Graylog configuration, the URI must use https as the protocol.
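The rules above can be turned into a small pre-flight check. This is only a sketch under a few assumptions: the settings are given as a dictionary of strings, http_bind_address has the form host:port with an IPv4 address or host name, and the `tls_enabled` flag is supplied by the caller. The function names are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def _is_unroutable(host):
    """True for loopback or wildcard addresses and for 'localhost'."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a host name.
        return host == "localhost"
    return addr.is_loopback or addr.is_unspecified


def publish_uri_problems(settings, tls_enabled=False):
    """Return a list of problems found in the two HTTP settings."""
    problems = []
    bind_host = settings.get("http_bind_address", "").rsplit(":", 1)[0]
    publish = settings.get("http_publish_uri")
    if bind_host == "0.0.0.0" and not publish:
        problems.append("http_publish_uri must be set when binding to 0.0.0.0")
    if publish:
        parts = urlsplit(publish)
        expected = "https" if tls_enabled else "http"
        if parts.scheme != expected:
            problems.append(f"http_publish_uri should use {expected}")
        if not parts.hostname or _is_unroutable(parts.hostname):
            problems.append("http_publish_uri is not reachable by other nodes")
    return problems
```

An empty result means the two settings are at least consistent with each other; it does not prove that the other nodes can actually reach the published URI.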

Graylog to MongoDB Connection

The mongodb_uri configuration setting must include all MongoDB nodes forming the replica set, the name of the replica set, as well as the previously configured user account with access to the replica set. The configuration setting is a normal MongoDB connection string.

Finally, the MongoDB connection string in the Graylog configuration file should look like this:

mongodb_uri = mongodb://USERNAME:PASSWORD@mongodb-node01:27017,mongodb-node02:27017,mongodb-node03:27017/graylog?replicaSet=rs01
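If the user name or password contains characters such as @, : or /, they need to be percent-encoded in a MongoDB connection string. The following sketch assembles the value from its parts; the function name is hypothetical, and the host names, database and replica set name in the usage note are the ones from the example above.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(username, password, hosts, database, replica_set):
    """Assemble a MongoDB connection string for a replica set."""
    # Percent-encode the credentials so reserved characters are safe.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    host_list = ",".join(hosts)
    return f"mongodb://{credentials}@{host_list}/{database}?replicaSet={replica_set}"
```

Calling it with `"USERNAME"`, `"PASSWORD"`, the three `mongodb-nodeNN:27017` hosts, `"graylog"` and `"rs01"` reproduces the connection string shown above.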

Graylog to Elasticsearch Connection

Graylog will connect to the Elasticsearch REST API.

To avoid issues with the connection to the Elasticsearch cluster you should add some of the network addresses of the Elasticsearch nodes to elasticsearch_hosts.
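The value of elasticsearch_hosts is a comma-separated list of URIs, optionally carrying credentials for a secured cluster. The sketch below builds such a value; the function name, the node names in the usage note, and the defaults of http and port 9200 are assumptions for illustration.

```python
from urllib.parse import quote


def build_elasticsearch_hosts(nodes, username=None, password=None,
                              scheme="http", port=9200):
    """Build a comma-separated list of Elasticsearch node URIs."""
    userinfo = ""
    if username is not None:
        # Percent-encode credentials before embedding them in the URI.
        user = quote(username, safe="")
        secret = quote(password or "", safe="")
        userinfo = f"{user}:{secret}@"
    return ",".join(f"{scheme}://{userinfo}{node}:{port}" for node in nodes)
```

For example, `build_elasticsearch_hosts(["es-node01", "es-node02"])` returns `http://es-node01:9200,http://es-node02:9200`, which can be used as the value of elasticsearch_hosts.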

Graylog Web Interface

It’s possible to use a load balancer in front of all Graylog servers. Please refer to Making the web interface work with load balancers/proxies for more details.

Depending on your setup, TLS/HTTPS termination can be handled by a hardware load balancer or a reverse proxy, or you can simply enable TLS in the Graylog node.

Scaling

Each component in this multi-node setup can be scaled according to individual needs.

Depending on the number of messages ingested and how long messages should be available for direct search, the Elasticsearch cluster will need most of the resources in your setup.

Keep an eye on the metrics of each part of the cluster. One option is to use telegraf to fetch important metrics and store them in your favorite metric system (e.g. Graphite, Prometheus or Influx).

Elasticsearch metrics and some administration can be handled with Elastic HQ or Cerebro. Those will help you to understand the Elasticsearch cluster health and behavior.

Graylog metrics can be monitored with the Graylog Metrics Reporter plugins, which are able to send the internal Graylog metrics to your favorite metrics collector (e.g. Graphite or Prometheus).
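If you prefer a scripted check alongside those tools, the Elasticsearch REST API exposes a cluster health document at `_cluster/health`. The sketch below only interprets such a JSON response and leaves fetching it to you; the function name and the `expected_nodes` parameter are hypothetical, while `status`, `number_of_nodes` and `unassigned_shards` are fields of the health response.

```python
import json


def summarize_health(raw_json, expected_nodes):
    """Return warnings derived from a _cluster/health JSON response."""
    health = json.loads(raw_json)
    warnings = []
    if health["status"] != "green":
        warnings.append(f"cluster status is {health['status']}")
    if health["number_of_nodes"] < expected_nodes:
        warnings.append(
            f"only {health['number_of_nodes']} of {expected_nodes} nodes joined"
        )
    if health["unassigned_shards"] > 0:
        warnings.append(f"{health['unassigned_shards']} shards are unassigned")
    return warnings
```

An empty list means the cluster reports green status with all expected nodes and no unassigned shards.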

We have almost never faced the issue that the MongoDB replica set needed special attention. But of course you should still monitor it and store its metrics - just to be sure.

Troubleshooting

  • After every configuration change or service restart, watch the log file of the applications you have worked on. Sometimes other log files can also give you hints about what went wrong. For example, if you’re configuring Graylog and trying to find out why the connection to MongoDB isn’t working, the MongoDB logs can help to identify the problem.
  • If HTTPS has been enabled for the Graylog REST API, it needs to be set up for the Graylog web interface, too.