
Elasticsearch as log collector

We use a cluster of three dedicated Elasticsearch VMs to collect logs from, and stats about, all other VMs.

Setup

Set up the VM

The setup process is largely automated, up to the point where all software is installed and a first round of updates has been applied. The Elasticsearch instances are not started automatically, as they need to be brought up in a coordinated way to form a cluster.

If you redo the procedure below, that is: if you boot from the custom-boot ISO, the VM will be wiped and set up with the initial configuration. The ISOs stay attached to the VMs, so simply forcing the EFI menu on startup and selecting the first optical drive to boot from will reset the node.

We used the Rocky Linux 9 minimal ISO image and a customized boot ISO image to initially configure the network interfaces and fetch the actual install information. Extract the minimal image and copy EFI, images and isolinux to new-iso. Then change the 'Install Rocky Linux 9.3' entry in new-iso/EFI/BOOT/grub.cfg:

<code class="grub">[...]
menuentry 'Install Rocky Linux 9.3 on logsearch2' --class fedora --class gnu-linux --class gnu --class os {
	linuxefi /images/pxeboot/vmlinuz inst.stage2=hd:LABEL=Rocky-9-3-x86_64-dvd quiet \
             ip=10.3.6.92::10.3.6.1:255.255.255.0:logsearch2.servants.priv:ens224:none \
             inst.ks=http://10.3.6.50:8088/logsearch2.ks
	initrdefi /images/pxeboot/initrd.img
}
[...]

Change the IP address and host name as needed, then build the custom boot ISO:

<code class="powershell">mkisofs -U -A Rocky-9-3-x86_64 -V Rocky-9-3-x86_64-custom-boot -volset Rocky-9-3-x86_64 `
   -J -joliet-long -r -v -T `
   -o Rocky-9.3-x86_64-custom-boot.iso `
   -b isolinux/isolinux.bin -c isolinux/boot.cat -no-emul-boot -boot-load-size 4 `
   -boot-info-table -eltorito-alt-boot -eltorito-platform 0xEF -eltorito-boot images/efiboot.img -no-emul-boot new-iso

Create a standard Rocky Linux VM with 8 vCPUs, 8 GB of RAM and 200 GB of storage. Add two optical drives and attach the custom boot image and the Rocky Linux 9.3 minimal image. Boot from the first optical drive. The kickstart files are split into a machine/network specific part and the logsearch-common setup. They can be found on kubenodconfig/rancher (10.3.6.50).
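
Before booting the VM you can check that the node's kickstart file is actually being served; a quick probe from any machine on the network works (the URL is the one referenced in grub.cfg above):

# fetch the kickstart file and show the first lines
curl -fsS http://10.3.6.50:8088/logsearch2.ks | head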

Set up the elasticsearch cluster

The first machine is set up a little differently than the others.

First node

It will initially form a single node cluster that will then be extended with the other machines.

Create the following elasticsearch.yml and copy it to /etc/elasticsearch, or edit it there directly. You will need to copy this file to the other nodes (see the copy sketch below the file), so it may be easier to create it as the logsearch user.

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: logsearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: logsearch1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
network.host: 10.3.6.91
network.bind_host: 10.6.16.91
network.publish_host: 10.6.16.91
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["10.6.16.92", "10.6.16.93"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["logsearch1"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Allow wildcard deletion of indices:
#
#action.destructive_requires_name: false
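
A minimal sketch for distributing the file, assuming it was drafted in the logsearch user's home directory and that user exists on all nodes; each copy still needs its node-specific name and addresses adjusted, as described below:

# copy the draft config to the two other nodes
scp ~/elasticsearch.yml logsearch@10.3.6.92:
scp ~/elasticsearch.yml logsearch@10.3.6.93: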

Starting the service will generate certificates and append a BEGIN SECURITY AUTO CONFIGURATION section to this file.

sudo systemctl enable --now elasticsearch
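
You can follow the startup and the security auto-configuration in the journal:

sudo journalctl -u elasticsearch -f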

Join the other two nodes

Before you run the join command, edit /etc/elasticsearch/elasticsearch.yml and set the values as above, changing the IP addresses and hostnames as needed.

# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: logsearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: logsearch2
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
network.host: 10.3.6.92
network.bind_host: 10.6.16.92
network.publish_host: 10.6.16.92

Do not set discovery.seed_hosts or cluster.initial_master_nodes. With this bit of configuration in place you can follow the procedure described in the Elastic documentation and use its automatic enrollment.
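
The short version of that procedure, following the standard Elasticsearch 8.x enrollment flow (consult the Elastic documentation for the authoritative steps):

# on logsearch1, generate a node enrollment token
sudo /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node

# on the joining node, reconfigure with that token and start the service
sudo /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token <token>
sudo systemctl enable --now elasticsearch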

Set discovery.seed_hosts

To be able to start automatically while the two other nodes are still running, discovery.seed_hosts needs to be set to the other hosts on each node:

  • 10.3.6.91: discovery.seed_hosts: ["10.6.16.92:9300", "10.6.16.93:9300"]
  • 10.3.6.92: discovery.seed_hosts: ["10.6.16.91:9300", "10.6.16.93:9300"]
  • 10.3.6.93: discovery.seed_hosts: ["10.6.16.91:9300", "10.6.16.92:9300"]
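
After editing the file, restart the service on each node for the change to take effect:

sudo systemctl restart elasticsearch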

(Re)set the elastic superuser password

By default an Elasticsearch cluster started via systemd will not have a password set for the elastic built-in superuser, so it will not let you do anything useful. This is mentioned in the logs:

Auto-configuration will not generate a password for the elastic built-in superuser, as we cannot determine if there is a terminal attached to the elasticsearch process. You can use the bin/elasticsearch-reset-password tool to set the password for the elastic user.

Set the elastic password:

sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic
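
With the new password you can verify the cluster state; a sketch, assuming the auto-generated CA certificate at /etc/elasticsearch/certs/http_ca.crt:

# cluster health and node list, authenticated as elastic
curl --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic "https://10.6.16.91:9200/_cluster/health?pretty"
curl --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic "https://10.6.16.91:9200/_cat/nodes?v"

All three nodes should show up once they have joined.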

Set up Kibana as the UI

The UI can run anywhere that can reach the Elasticsearch REST API endpoints at 10.6.16.91:9200, 10.6.16.92:9200 and 10.6.16.93:9200. Because of this we use the Docker setup method, but run it on our Kubernetes cluster:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    workload.user.cattle.io/workloadselector: apps.deployment-logsearch-kibana
  name: kibana
  namespace: logsearch
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      workload.user.cattle.io/workloadselector: apps.deployment-logsearch-kibana
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        workload.user.cattle.io/workloadselector: apps.deployment-logsearch-kibana
      namespace: logsearch
    spec:
      containers:
        - env:
            - name: SERVER_NAME
              value: logsearch.cluster.machine-deck.jeffries-tube.at
            - name: ELASTICSEARCH_HOSTS
              value: >-
                ["https://10.6.16.91:9200","https://10.6.16.92:9200","https://10.6.16.93:9200"]
          image: docker.elastic.co/kibana/kibana:8.12.0
          imagePullPolicy: Always
          name: kibana
          ports:
            - containerPort: 5601
              name: http
              protocol: TCP
          resources: {}
          securityContext:
            allowPrivilegeEscalation: false
            privileged: false
            readOnlyRootFilesystem: false
            runAsNonRoot: false
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /usr/share/kibana/data
              name: logsearch-kibana-data
            - mountPath: /usr/share/kibana/config
              name: logsearch-kibana-config
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - name: logsearch-kibana-data
          persistentVolumeClaim:
            claimName: logsearch-kibana-data
        - name: logsearch-kibana-config
          persistentVolumeClaim:
            claimName: logsearch-kibana-config

Also a service container is needed to:

  • set the access rights for /usr/share/kibana/config and /usr/share/kibana/data
  • create at least an empty /usr/share/kibana/config/kibana.yml so the kibana container does not crash immediately

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    workload.user.cattle.io/workloadselector: apps.deployment-logsearch-service
  namespace: logsearch
  name: service
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      workload.user.cattle.io/workloadselector: apps.deployment-logsearch-service
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        workload.user.cattle.io/workloadselector: apps.deployment-logsearch-service
      namespace: logsearch
    spec:
      affinity: {}
      containers:
        - image: quay.io/toolbx-images/ubuntu-toolbox:22.04
          imagePullPolicy: Always
          name: service
          resources: {}
          securityContext:
            allowPrivilegeEscalation: false
            privileged: false
            readOnlyRootFilesystem: false
            runAsNonRoot: false
          stdin: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          tty: true
          volumeMounts:
            - mountPath: /kibana/config
              name: config
            - mountPath: /kibana/data
              name: data
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - name: config
          persistentVolumeClaim:
            claimName: logsearch-kibana-config
        - name: data
          persistentVolumeClaim:
            claimName: logsearch-kibana-data
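
One way to get a shell in the service container, assuming kubectl access to the cluster (deploy/service refers to the deployment above):

kubectl -n logsearch exec -it deploy/service -- bash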

So within that container run:

chown 1000:0 /kibana/config
chown 1000:0 /kibana/data
touch /kibana/config/kibana.yml
chown 1000:0 /kibana/config/kibana.yml

Generate a Kibana enrollment token on one of the elasticsearch nodes:

sudo /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana

Then within the kibana container:

bin/kibana-setup
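
From outside the pod the same can be run via kubectl, analogous to the service container above:

kubectl -n logsearch exec -it deploy/kibana -- bin/kibana-setup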

This just asks for the enrollment token we generated above. The generated kibana.yml is functional but rudimentary. A more complete setup looks like this:

server:
  host: "0.0.0.0"
  name: logsearch.cluster.machine-deck.jeffries-tube.at
  publicBaseUrl: https://logsearch.cluster.machine-deck.jeffries-tube.at
# This section was automatically generated during setup.
elasticsearch:
  hosts:
   - https://10.3.6.91:9200
   - https://10.3.6.92:9200
   - https://10.3.6.93:9200
  serviceAccountToken: <replace with data from kibana-setup>
  ssl:
    certificateAuthorities:
     - <replace with data from kibana-setup>
xpack:
  reporting:
    kibanaServer:
      hostname: logsearch.cluster.machine-deck.jeffries-tube.at
  fleet:
    outputs:
     - id: fleet-default-output
       name: default
       is_default: true
       is_default_monitoring: true
       type: elasticsearch
       hosts: 
        - https://10.3.6.91:9200
        - https://10.3.6.92:9200
        - https://10.3.6.93:9200
       ca_trusted_fingerprint: <replace with data from kibana-setup>

Without at least the server config this will not expose the UI outside the kibana container.
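
Assuming an ingress or load balancer already routes the publicBaseUrl to the Kibana port (5601), a quick reachability check:

curl -sI https://logsearch.cluster.machine-deck.jeffries-tube.at | head -n 1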
