Navigating between cluster resources and observable signals.

Overview

Observable resources produce many kinds of signal - logs, metrics, traces, alerts and others. Each type of signal can have its own distinct data model, naming conventions, query language, and storage. This makes it difficult to navigate relationships between cluster resources and the signals they generate.

The OpenTelemetry project defines a standard vocabulary for describing several types of signal. Korrel8r works with OpenTelemetry but does not require it. See About OpenTelemetry and Korrel8r for more.

Korrel8r is a rule-based correlation engine, with an extensible rule set, that can navigate:

  • many types of signal and resource data

  • using diverse schemas, data models and naming conventions

  • queried using diverse query languages

  • stored in multiple stores with diverse query APIs

Each type of signal or resource is represented by a domain. Korrel8r can be extended to handle new signals and resources by adding new domains.

Relationships within and between domains are expressed as rules.

About domains

A korrel8r domain represents one type of signal or resource. Available domains are described in the Domain Reference.

Each domain defines the following abstractions, allowing Korrel8r to treat domains uniformly:

Object

A data type representing an instance of the domain’s signal or resource, for example a log record or a serialized Kubernetes resource. Korrel8r does not distinguish between signals and resources; they are correlated in the same way.

Store

A service that can be queried for objects. A domain provides one or more types of store. Each store represents a storage service like the Kubernetes API server, Prometheus, Loki or others.

Class

A named class of objects represented by the same data structure. A class name is a string of the form domain:class. Many domains have only one class, for example trace:span. Others have multiple classes. The k8s domain has a class for each kind of Kubernetes resource, for example k8s:Pod and k8s:DaemonSet.

Query

A string of the form domain:class:selector which selects a set of objects from a store. Korrel8r does not define a query language; selectors use the native query language of the store. For example, a trace query using TraceQL:
trace:span:{.kubernetes.namespace.name="foobar"}
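
As a rough illustration of the query form, the sketch below splits a query string into its three parts. It is written in Python for brevity; the function name parse_query and the Query tuple are hypothetical and not part of Korrel8r. It assumes that domain and class names never contain a colon, so only the first two colons are treated as separators.

```python
from typing import NamedTuple


class Query(NamedTuple):
    """Hypothetical container for the three parts of a query string."""
    domain: str
    cls: str
    selector: str


def parse_query(text: str) -> Query:
    """Split 'domain:class:selector' on the first two colons only.

    The selector is kept verbatim because it is written in the native
    query language of the store and may itself contain colons.
    """
    parts = text.split(":", 2)
    if len(parts) != 3 or not parts[0] or not parts[1]:
        raise ValueError(f"expected domain:class:selector, got {text!r}")
    return Query(*parts)


q = parse_query('trace:span:{.kubernetes.namespace.name="foobar"}')
print(q.domain, q.cls, q.selector)
```

Splitting at most twice matters for domains such as alert, whose JSON selectors contain colons of their own.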

About rules

Rules express relationships between classes, possibly in different domains.

Rules are templates that generate a goal query using data from a start object. The start and goal can be different classes, possibly from different domains. If a rule cannot be applied to an object it may return a blank string or raise an error.

The set of rules forms a graph connecting all the classes of data that korrel8r knows about. Korrel8r works by traversing this graph: applying rules to some initial objects, executing the resulting queries, retrieving more objects and so on.

There are two types of search:

Goal search

Given a set of start objects and a goal class: find all paths from the start objects to objects in the goal class.

Neighbourhood search

Given a set of start objects and a depth N: find all objects reachable from the start objects in N steps or fewer.
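
The two kinds of search can be pictured with a small, self-contained sketch. This is not Korrel8r's implementation: the real engine applies rules to objects and runs the resulting queries against stores, whereas this Python sketch only walks a hypothetical rule graph between class names. The RULES map and both function names are invented for illustration.

```python
from collections import deque

# Hypothetical rule graph: each start class maps to the goal classes
# that some rule can reach from it.
RULES = {
    "k8s:Deployment": ["k8s:Pod"],
    "k8s:Pod": ["log:application", "metric:metric"],
    "alert:alert": ["k8s:Deployment", "metric:metric"],
}


def neighbours(start: str, depth: int) -> dict[str, int]:
    """Return every class reachable from start in at most depth steps,
    mapped to the number of steps needed to reach it."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        cls = queue.popleft()
        if seen[cls] == depth:
            continue  # do not expand beyond the requested depth
        for goal in RULES.get(cls, []):
            if goal not in seen:
                seen[goal] = seen[cls] + 1
                queue.append(goal)
    return seen


def goal_paths(start: str, goal: str, path=()) -> list[list[str]]:
    """Return all cycle-free paths of classes from start to goal."""
    path = (*path, start)
    if start == goal:
        return [list(path)]
    found = []
    for nxt in RULES.get(start, []):
        if nxt not in path:  # avoid revisiting a class on this path
            found.extend(goal_paths(nxt, goal, path))
    return found


print(neighbours("k8s:Deployment", 1))
print(goal_paths("alert:alert", "metric:metric"))
```

The neighbourhood search is a breadth-first walk cut off at depth N, while the goal search enumerates paths that end in the goal class.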

About OpenTelemetry and Korrel8r

The OpenTelemetry project (OTEL) defines a standard vocabulary for traces, metrics, and logs. The Korrel8r trace domain implements OTEL; other OTEL domains may be added in future.

As OTEL is adopted, it will provide a more consistent vocabulary for Korrel8r rules, but it will not replace Korrel8r:

  • There are still widely-deployed systems that do not implement OTEL. Korrel8r can bridge between OTEL and non-OTEL systems.

  • OTEL does not provide mechanisms for describing or navigating relationships between signals and resources; this is outside its mandate.

  • OTEL does not standardize query languages or store APIs, even for OTEL-compliant systems.

  • Korrel8r describes signal types that are not covered by OTEL. For example k8s:Event or netflow:network events.

Running Korrel8r

Korrel8r usually runs as a cluster service with a REST API. It is used as a back-end for tools that display correlation results. For example, the Red Hat OpenShift console uses Korrel8r to implement its troubleshooting panel.

It can also be run from the command line for development and testing.

See Configuration for details of configuring Korrel8r.

Korrel8r uses a Bearer Token to authenticate itself to stores.
When run as a server, it impersonates the client by copying the token from the incoming REST request.
On the command line it uses the current kubectl login, so you need to be logged in to the cluster.

In-cluster service

The Red Hat Cluster Observability Operator automatically installs and configures korrel8r on a Red Hat OpenShift cluster.

For testing and development you can also:

Deploy the latest from GitHub.
oc apply -k github.com/korrel8r/korrel8r/config/?version=main
Deploy version X.Y.Z from GitHub.
oc apply -k github.com/korrel8r/korrel8r/config/?version=vX.Y.Z
Use the community operator from OperatorHub; look for the Korrel8r Community tile.

To connect to your service from outside the cluster, you can expose it with an 'ingress' or 'route'.

Create a korrel8r route on Red Hat OpenShift
oc apply -k github.com/korrel8r/korrel8r/config/route?version=main
HOST=$(oc get route/korrel8r -n korrel8r -o template='{{.spec.host}}')

You can use the Clients to test your service at $HOST.

Out-of-cluster service

Korrel8r can run outside the cluster, using routes or ingress URLs to connect to observability stores inside the cluster. This can be useful for development and testing.

Install korrel8r command
go install github.com/korrel8r/korrel8r/cmd/korrel8r@latest

Run korrel8r --help to get command line help.

Download configuration for a Red Hat OpenShift cluster.
curl -o korrel8r.yaml  https://raw.githubusercontent.com/korrel8r/korrel8r/main/etc/korrel8r/openshift-route.yaml
Run as a service outside the cluster on localhost port 8080, with informational logging.
korrel8r -v2 --config korrel8r.yaml web --http=localhost:8080

See Clients for ways to test your service at http://localhost:8080.

Clients

Once the Korrel8r service is running, there are several ways you can interact with the REST API.

You can also use specialized clients like the Red Hat OpenShift troubleshooting panel, or build your own.

All of the clients require a Bearer Token to authenticate with the cluster.
If you are logged in to a Red Hat OpenShift cluster you can find your bearer token like this:

oc whoami -t

korrel8rcli

korrel8rcli is a simple command-line client of the Korrel8r service. It is intended for development and experimentation. In production korrel8r normally functions as a back-end for a console or other visualization or analysis tool.

Install korrel8rcli
go install github.com/korrel8r/client/cmd/korrel8rcli@latest

Run korrel8rcli --help to get command line help.

If you are logged on to a cluster, korrel8rcli will automatically use your bearer token. You can also specify a token explicitly with the --bearer-token option.

In the following examples, replace http://korrel8r.example with the URL for your korrel8r service.

Domains
Get the list of domains and store status.
korrel8rcli -u http://korrel8r.example domains
Example output
- name: alert
  stores:
  - alertmanager: https://alertmanager-main.openshift-monitoring.svc:9094
    certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    domain: alert
    metrics: https://thanos-querier.openshift-monitoring.svc:9091
- name: k8s
  stores:
  - domain: k8s
- name: log
  stores:
  - certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    domain: log
    lokiStack: https://logging-loki-gateway-http.openshift-logging.svc:8080
- name: metric
  stores:
  - certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    domain: metric
    metric: https://thanos-querier.openshift-monitoring.svc:9091
- name: mock
  stores: null
- name: netflow
  stores:
  - certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    domain: netflow
    lokiStack: https://loki-gateway-http.netobserv.svc:8080
Neighbours
Get the neighbours graph of the korrel8r deployment.
korrel8rcli -o json -u http://korrel8r.example neighbours --query 'k8s:Deployment:{namespace: korrel8r}'
Output is a JSON-encoded graph.
{"edges":[{"goal":"k8s:Pod.v1.","start":"k8s:Deployment.v1.apps"},...
 "nodes":[{"class":"log:application","count":15,"queries":[...
Web browser
This tool is experimental; it may change or be dropped in future.
Run a web server to visualize results.
korrel8rcli web -u http://korrel8r.example --addr :9090
View results in a browser.

Open localhost:9090 in a web browser. You can type a korrel8r query in the Start box, and a numeric depth (for neighbours) or korrel8r class name (for goal search). You should see a graph of the correlations, like this.

Figure 1. Example graph.

Interactive REST API

Korrel8r provides an interactive REST API page using Swagger. This page documents the REST API and allows you to make interactive calls.

Browse to http://korrel8r.example/swagger to use the page.

Using curl

You can use a generic REST client like curl to interact with korrel8r. You can use the Interactive REST API page to discover the right curl commands. When you execute a call interactively on this page, it will show you the corresponding curl command.

You must provide a bearer token, for example:

curl --oauth2-bearer $(oc whoami -t) http://localhost:8080/api/v1alpha1/domains
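
The same call can be made from a program. The sketch below uses only the Python standard library; domains_request and fetch_domains are hypothetical helper names, and the base URL and token are placeholders that you must replace. Nothing is sent until fetch_domains is called against a running service.

```python
import json
import urllib.request


def domains_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the domains endpoint with a bearer token."""
    url = base_url.rstrip("/") + "/api/v1alpha1/domains"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_domains(base_url: str, token: str):
    """Send the request and decode the JSON response (needs a live service)."""
    with urllib.request.urlopen(domains_request(base_url, token)) as resp:
        return json.load(resp)


# Building the request does not contact the server.
req = domains_request("http://localhost:8080", "TOKEN_FROM_oc_whoami")
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before talking to a cluster.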

Red Hat OpenShift troubleshooting panel

The Red Hat OpenShift console has a "troubleshooting panel" that is powered by Korrel8r. This is a good example of how korrel8r is intended to be used, as a back-end to compute correlations for other tools that visualize the results.

Troubleshooting

You can increase the verbosity of korrel8r logging at run time using the config API.

Using korrel8rcli
korrel8rcli config --set-verbose=9
Using curl
curl --oauth2-bearer $(oc whoami -t) -X PUT 'http://localhost:8080/api/v1alpha1/config?verbose=9'

Configuration

Korrel8r loads configuration from a file or URL specified by the --config option:

korrel8r --config <file_or_url>

The Korrel8r project provides example configuration files. You can download them or use them directly via URL.

openshift-route.yaml

Used to run korrel8r outside the cluster, connecting to stores via routes.

openshift-svc.yaml

Used to run korrel8r as an In-cluster service, connecting to stores via service URLs.

The configuration is a YAML file with the following sections:

include

Other configuration fragments to include.
include:
  - "path_or_url"

stores

Connections to data stores.
stores:
  - domain: "domain_name" (1)
    # Domain-specific fields (2)
1 Domain name of the store (required).
2 Domain-specific fields for connection parameters. See Domain Reference.

Every entry in the stores section has a domain field to identify the domain. Other fields depend on the domain, see Domain Reference. Store fields can contain plain URL strings or templates that expand to URLs.

Example: configuring a store URL from an OpenShift Route resource.
stores:
  - domain: log
    lokiStack: >-
      {{$r := query "k8s:Route:{namespace: openshift-logging, name: logging-loki}" -}} (1)
      https://{{ (first $r).Spec.Host -}} (2)
1 Get a list of routes in "openshift-logging" named "logging-loki".
2 Use the .Spec.Host field of the first route as the host for the store URL.

rules

Rules to relate different classes of data.
rules:
  - name: "rule_name" (1)
    start: (2)
      domain: "domain_name"
      classes:
        - "class_name"
    goal: (3)
      domain: "domain_name"
      classes:
        - "class_name"
    result:
      query: "query_template" (4)
1 Name identifies the rule in graphs and for debugging.
2 Start objects for this rule must belong to one of the classes in the domain.
3 Goal queries generated by this rule must retrieve one of the classes in the domain.
4 Result queries are generated by executing the query Go template with the start object as context.

Korrel8r comes with a comprehensive set of rules by default, but you can modify them or add your own.

A rule has the following key elements:

  • A set of start classes. The rule can apply to objects belonging to one of these classes.

  • A set of goal classes. The rule can generate queries for any of these classes.

  • A Go template to generate a goal query from a start object.

The query template should generate a string of the form:

<domain-name>:<class-name>:<query-details>

The query-details part depends on the domain, see Domain Reference.
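
To make the shape of a rule concrete, the sketch below mimics one in Python. It is only an analogy: real rules are Go templates in the YAML configuration, and the rule shown here (from a Pod to its application logs) is hypothetical. The dictionary layout of the start object and the function name pod_to_logs are also assumptions; the LogQL label names are borrowed from the log domain example in the Domain Reference.

```python
def pod_to_logs(pod: dict) -> str:
    """Hypothetical rule: start class k8s:Pod, goal class log:application.

    Returns a goal query string, or an empty string when the rule
    cannot be applied to the start object.
    """
    meta = pod.get("metadata", {})
    namespace, name = meta.get("namespace"), meta.get("name")
    if not namespace or not name:
        return ""  # nothing to correlate on
    selector = (
        f'{{kubernetes_namespace_name="{namespace}",'
        f'kubernetes_pod_name="{name}"}}'
    )
    return f"log:application:{selector}"


start = {"metadata": {"namespace": "korrel8r", "name": "korrel8r-0"}}
print(pod_to_logs(start))
```

The result has the domain-name, class-name and query-details parts described above, with the query details written in the goal domain's own language.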

aliases

Short-hand alias names for groups of classes.
aliases:
  - name: "alias_name" (1)
    domain: "domain_name" (2)
    classes: (3)
      - "class_name"
1 Alias name can be used in rule definitions wherever a class name is allowed.
2 Domain for classes in this alias.
3 Classes belonging to this alias.

Writing Templates

Korrel8r rules and store configuration use Go templates[1].

Korrel8r provides some additional template functions to simplify writing rules and configurations:

  • The sprig library of general purpose template functions is always available.

  • Some domains (for example the k8s domain) provide domain-specific functions, see the Domain Reference.

  • The following function is available for store configurations:

    query

    Takes a single argument, a korrel8r query string. Executes the query and returns the result as a []any. May return an error.

Domain Reference

Reference details for the classes, objects, queries and stores of each available domain.

k8s domain

Domain k8s implements Kubernetes resources stored in a Kubernetes API server.

Store

The k8s domain automatically connects to the current cluster (as determined by kubectl); no additional configuration is needed.

stores:
  - domain: k8s

Template Functions

The following template functions are available to rules.

k8sClass
    Takes string arguments (apiVersion, kind).
    Returns the korrel8r.Class implied by the arguments, or an error.

Query

Query struct for a Kubernetes query.

Example:

k8s:Pod.v1.:{"namespace":"openshift-cluster-version","name":"cluster-version-operator-8d86bcb65-btlgn"}

See Go documentation for Query

Object

Object is a struct type representing a Kubernetes resource.

Object can be one of the standard k8s types from k8s.io/api/core, or a generated custom resource type.

Rule templates should use capitalized Go field names rather than the lowercase JSON field names.

See Go documentation for Object

alert domain

Domain alert provides Prometheus alerts, queries and access to Thanos and AlertManager stores.

Class

There is a single class alert:alert.

Object

An alert object is represented by the Go type Object. Rules starting from an alert should use the capitalized Go field names rather than the lowercase JSON names.

Query

A JSON map of string names to string values, matched against alert labels, for example:

alert:alert:{"alertname":"KubeStatefulSetReplicasMismatch","container":"kube-rbac-proxy-main","namespace":"openshift-logging"}
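
The sketch below illustrates one plausible reading of "matched against alert labels": an alert matches when every name and value in the selector also appears in its labels. That subset rule is an assumption of this example, as are the function name alert_matches and the extra severity label; the stores may apply different matching.

```python
import json


def alert_matches(selector: str, labels: dict[str, str]) -> bool:
    """Return True if every label in the JSON selector is present in
    labels with the same value (an assumed matching rule)."""
    wanted = json.loads(selector)
    return all(labels.get(name) == value for name, value in wanted.items())


labels = {
    "alertname": "KubeStatefulSetReplicasMismatch",
    "container": "kube-rbac-proxy-main",
    "namespace": "openshift-logging",
    "severity": "warning",  # hypothetical extra label
}
print(alert_matches('{"namespace":"openshift-logging"}', labels))
```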

Store

A client of Prometheus and/or AlertManager. Store configuration:

domain: alert
metrics: PROMETHEUS_URL
alertmanager: ALERTMANAGER_URL

Either or both of metrics and alertmanager may be present.

Query

Query is a map of label name:value pairs for matching alerts, serialized as JSON.

See Go documentation for Query

Object

Object contains alert data, passed as *Object when used as a korrel8r.Object.

See Go documentation for Object

log domain

Domain log is a domain for openshift-logging ViaQ logs stored in Loki or LokiStack.

Class

There are 3 classes corresponding to the 3 openshift logging log types:

log:application
log:infrastructure
log:audit

Object

A log object is a JSON map[string]any in ViaQ format.

Query

A query is a LogQL query string, prefixed by the logging class, for example:

log:infrastructure:{ kubernetes_namespace_name="openshift-cluster-version", kubernetes_pod_name=~".*-operator-.*" }

Store

To connect to a lokiStack store use this configuration:

domain: log
lokiStack: URL_OF_LOKISTACK_PROXY

To connect to a plain Loki store use:

domain: log
loki: URL_OF_LOKI


Template Functions

logTypeForNamespace
    Takes a namespace string argument.
    Returns the log type ("application" or "infrastructure") of a container in the namespace.

logSafeLabel
    Converts the string argument into a safe label containing only alphanumerics, '_' and ':'.
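
The behaviour of these two functions can be approximated in a few lines. The Python below is a sketch under stated assumptions, not the Korrel8r code: it assumes that namespaces named default, openshift or kube, or prefixed with openshift- or kube-, produce infrastructure logs, and that unsafe characters are replaced with '_'. Both function names are hypothetical Python spellings.

```python
import re


def log_type_for_namespace(namespace: str) -> str:
    """Assumed convention: system namespaces hold infrastructure logs."""
    if namespace in ("default", "openshift", "kube") or namespace.startswith(
        ("openshift-", "kube-")
    ):
        return "infrastructure"
    return "application"


def log_safe_label(text: str) -> str:
    """Keep ASCII alphanumerics, '_' and ':'; replace anything else.

    Using '_' as the replacement character is an assumption of this sketch.
    """
    return re.sub(r"[^A-Za-z0-9_:]", "_", text)


print(log_type_for_namespace("openshift-logging"))
print(log_safe_label("app.kubernetes.io/name"))
```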

Query

Query is a LogQL query string.

See Go documentation for Query

Object

Object is a map in ViaQ format.

See Go documentation for Object

metric domain

Domain metric represents Prometheus metric time-series as objects.

Class

There is only one class: metric:metric.

Object

A Metric is a time series identified by a label set. Korrel8r does not load the sample data for a time series, or use it in rules. If a korrel8r search has time constraints, then metrics that have no values that meet the constraints are ignored.

Store

The store is Prometheus. Store configuration:

domain: metric
metric: URL_OF_PROMETHEUS

Query

Query is a PromQL query string.

Korrel8r uses metric labels for correlation; it does not use time-series data values. The PromQL query is analyzed to identify the series it uses, and the labels of those series are used for correlation.

See Go documentation for Query

Object

See Go documentation for Object

netflow domain

Domain netflow is a domain for network observability flow events stored in Loki or LokiStack.

Class

There is a single class netflow:network.

Object

A netflow object is a JSON map[string]any in NetFlow format.

Query

A query is a LogQL query string, prefixed by netflow:network:, for example:

netflow:network:{SrcK8S_Type="Pod", SrcK8S_Namespace="myNamespace"}

Store

To connect to a netflow lokiStack store use this configuration:

domain: netflow
lokiStack: URL_OF_LOKISTACK_PROXY

To connect to a plain Loki store use:

domain: netflow
loki: URL_OF_LOKI

Query

Query is a LogQL query string.

See Go documentation for Query

Object

Object is a map holding netflow entries.

See Go documentation for Object

trace domain

Domain trace implements OpenTelemetry traces stored in the Grafana Tempo data store.

Store

The trace domain accepts an optional "tempostack" field with a URL for tempostack. If absent, Korrel8r connects to the default location for the trace store on an OpenShift cluster.

stores:
  - domain: trace
    tempostack: "https://url-of-tempostack"

Query

Query selector has two forms: a TraceQL query string, or a list of trace IDs.

A TraceQL query selects spans from many traces that match the query criteria. Example:

trace:span:{resource.kubernetes.namespace.name="korrel8r"}

A trace-id query is a list of hexadecimal trace IDs. It returns all the spans in each trace. Example:

trace:span:a7880cc221e84e0d07b15993358811b7,b7880cc221e84e0d07b15993358811b7
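
A client that builds or inspects trace queries may need to tell the two selector forms apart. The sketch below shows one way to do so in Python; the function name trace_selector_kind and the rule used (a comma-separated list made up entirely of hexadecimal strings counts as a list of trace IDs) are assumptions, and Korrel8r's own parsing may differ.

```python
import re

TRACE_ID = re.compile(r"^[0-9a-fA-F]+$")


def trace_selector_kind(selector: str):
    """Classify a trace:span selector.

    Returns ("ids", [id, ...]) when the selector is a comma-separated
    list of hexadecimal trace IDs, otherwise ("traceql", selector).
    """
    ids = [part.strip() for part in selector.split(",")]
    if all(TRACE_ID.match(i) for i in ids):
        return "ids", ids
    return "traceql", selector


print(trace_selector_kind('{resource.kubernetes.namespace.name="korrel8r"}'))
print(trace_selector_kind(
    "a7880cc221e84e0d07b15993358811b7,b7880cc221e84e0d07b15993358811b7"
))
```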

See Go documentation for Query

Object

Object represents an OpenTelemetry span.

A trace is simply a set of spans with the same trace-id. There is no explicit class or object representing a trace.

See Go documentation for Object

REST API

REST API for the Korrel8r correlation engine.

Version

v1alpha1

License

Apache 2.0

Contact

Project Korrel8r https://github.com/korrel8r/korrel8r

Content negotiation

URI Schemes
  • http

  • https

Consumes
  • application/json

Produces
  • application/json

Endpoints by group

operations

Method  URI                              Name                    Summary
GET     /api/v1alpha1/domains            get domains             Get name, configuration and status for each domain.
GET     /api/v1alpha1/objects            get objects             Execute a query, returns a list of JSON objects.
POST    /api/v1alpha1/graphs/goals       post graphs goals       Create a correlation graph from start objects to goal queries.
POST    /api/v1alpha1/graphs/neighbours  post graphs neighbours  Create a neighbourhood graph around a start object to a given depth.
POST    /api/v1alpha1/lists/goals        post lists goals        Create a list of goal nodes related to a starting point.
PUT     /api/v1alpha1/config             put config              Change key configuration settings at runtime.

Paths

Get name, configuration and status for each domain.

GET /api/v1alpha1/domains
All responses
Code     Status  Description  Has headers  Schema
200      OK      OK                        schema
default                                    schema

Execute a query, returns a list of JSON objects.

GET /api/v1alpha1/objects
Parameters
Name   Source  Type    Go type  Separator  Required  Default  Description
query  query   string  string              required           query string

All responses
Code     Status  Description  Has headers  Schema
200      OK      OK                        schema
default                                    schema

Create a correlation graph from start objects to goal queries.

POST /api/v1alpha1/graphs/goals
Parameters
Name     Source  Type     Go type       Separator  Required  Default  Description
rules    query   boolean  bool                     optional           include rules in graph edges
request  body    Goals    models.Goals                                search from start to goal classes

All responses
Code     Status           Description                  Has headers  Schema
200      OK               OK                                        schema
206      Partial Content  interrupted, partial result               schema
default                                                             schema

Create a neighbourhood graph around a start object to a given depth.

POST /api/v1alpha1/graphs/neighbours
Parameters
Name     Source  Type        Go type            Separator  Required  Default  Description
rules    query   boolean     bool                          optional           include rules in graph edges
request  body    Neighbours  models.Neighbours                                search from neighbours

All responses
Code     Status           Description                  Has headers  Schema
200      OK               OK                                        schema
206      Partial Content  interrupted, partial result               schema
default                                                             schema

Create a list of goal nodes related to a starting point.

POST /api/v1alpha1/lists/goals
Parameters
Name     Source  Type   Go type       Separator  Required  Default  Description
request  body    Goals  models.Goals                                search from start to goal classes

All responses
Code     Status  Description  Has headers  Schema
200      OK      OK                        schema
default                                    schema

Change key configuration settings at runtime.

PUT /api/v1alpha1/config
Parameters
Name     Source  Type     Go type  Separator  Required  Default  Description
verbose  query   integer  int64               optional           verbose setting for logging

All responses
Code     Status  Description  Has headers  Schema
200      OK      OK                        schema
default                                    schema

Models

Constraint

Constraint constrains the objects that will be included in search results.

Properties

Name     Type                          Go type          Required  Default  Description  Example
end      date-time (formatted string)  strfmt.DateTime                     End of time interval, quoted RFC 3339 format.
limit    integer                       int64                               Limit number of objects returned per query, <=0 means no limit.
start    date-time (formatted string)  strfmt.DateTime                     Start of time interval, quoted RFC 3339 format.
timeout  string                        string                              Timeout per request, h/m/s/ms/ns format
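
As a sketch of how a client might fill in these fields, the Python below builds a Constraint covering a one-hour window. The helper name constraint and the sample values are hypothetical; the field names come from the table above, and the timestamps are formatted as RFC 3339 in UTC.

```python
import json
from datetime import datetime, timedelta, timezone


def constraint(end: datetime, window: timedelta, limit: int, timeout: str) -> dict:
    """Build a Constraint covering the window that ends at end."""

    def rfc3339(t: datetime) -> str:
        # Normalize to UTC and use the 'Z' suffix allowed by RFC 3339.
        return t.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return {
        "start": rfc3339(end - window),
        "end": rfc3339(end),
        "limit": limit,
        "timeout": timeout,
    }


end = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
c = constraint(end, timedelta(hours=1), limit=100, timeout="5s")
print(json.dumps(c, indent=2))
```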

Domain

Domain configuration information.

Properties

Name    Type     Go type  Required  Default  Description  Example
name    string   string                      Name of the domain.
stores  []Store  []Store                     Stores configured for the domain.

Edge

Directed edge in the result graph, from Start to Goal classes.

Properties

Name   Type    Go type  Required  Default  Description                                          Example
goal   string  string                      Goal is the class name of the goal node.             domain:class
rules  []Rule  []*Rule                     Rules is the set of rules followed along this edge.
start  string  string                      Start is the class name of the start node.

Goals

Starting point for a goals search.

Properties

Name   Type      Go type   Required  Default  Description                    Example
goals  []string  []string                     Goal classes for correlation.  ["domain:class"]
start  Start     Start

Graph

Graph resulting from a correlation search.

Properties

Name   Type    Go type  Required  Default  Description  Example
edges  []Edge  []*Edge
nodes  []Node  []*Node

Neighbours

Starting point for a neighbours search.

Properties

Name   Type     Go type  Required  Default  Description                     Example
depth  integer  int64                       Max depth of neighbours graph.
start  Start    Start
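
The following Python sketch shows how a Neighbours request body might be assembled and posted to /api/v1alpha1/graphs/neighbours. It uses only the standard library; neighbours_request is a hypothetical helper, the URL and token are placeholders, and encoding the optional rules parameter as rules=true is an assumption. The request is built but not sent.

```python
import json
import urllib.request


def neighbours_request(base_url, token, query, depth, rules=False):
    """Build a POST request for /api/v1alpha1/graphs/neighbours."""
    # Body follows the Neighbours model: a Start object plus a depth.
    body = {"start": {"queries": [query]}, "depth": depth}
    url = base_url.rstrip("/") + "/api/v1alpha1/graphs/neighbours"
    if rules:
        url += "?rules=true"  # assumed encoding of the boolean parameter
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = neighbours_request(
    "http://localhost:8080", "TOKEN", "k8s:Deployment:{namespace: korrel8r}", 3
)
print(req.full_url)
print(req.data.decode())
```

To send it, pass the request to urllib.request.urlopen and decode the JSON response, which has the shape of the Graph model.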

Node

Node in the result graph, contains results for a single class.

Properties

Name     Type          Go type        Required  Default  Description                                                   Example
class    string        string                            Class is the full class name in "DOMAIN:CLASS" form.          domain:class
count    integer       int64                             Count of results found for this class, after de-duplication.
queries  []QueryCount  []*QueryCount                     Queries yielding results for this class.

QueryCount

Query run during a correlation with a count of results found.

Properties

Name   Type     Go type  Required  Default  Description                                            Example
count  integer  int64                       Count of results or -1 if the query was not executed.
query  string   string                      Query for correlation data.
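
The Graph, Edge, Node and QueryCount models are enough to read a graph response. The sketch below parses a small sample and summarizes it in Python. The sample data is hypothetical and abbreviated, shaped according to those models; summarize and successors are illustrative helper names, not part of any Korrel8r client.

```python
import json

# Hypothetical, abbreviated graph in the shape described by the Graph,
# Edge, Node and QueryCount models. Real output will be larger.
SAMPLE = r"""
{"edges": [{"start": "k8s:Deployment.v1.apps", "goal": "k8s:Pod.v1."},
           {"start": "k8s:Pod.v1.", "goal": "log:application"}],
 "nodes": [{"class": "k8s:Deployment.v1.apps", "count": 1, "queries": []},
           {"class": "k8s:Pod.v1.", "count": 2, "queries": []},
           {"class": "log:application", "count": 15,
            "queries": [{"query": "log:application:{kubernetes_namespace_name=\"korrel8r\"}",
                         "count": 15}]}]}
"""


def summarize(graph: dict) -> dict[str, int]:
    """Map each class in the graph to the number of objects found."""
    return {n["class"]: n.get("count", 0) for n in graph.get("nodes", [])}


def successors(graph: dict, cls: str) -> list[str]:
    """List the goal classes of edges that start at cls."""
    return [e["goal"] for e in graph.get("edges", []) if e["start"] == cls]


graph = json.loads(SAMPLE)
for name, count in summarize(graph).items():
    print(f"{name}: {count}")
```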

Rule

Rule is a correlation rule with a list of queries and results counts found during navigation.

Properties

Name     Type          Go type        Required  Default  Description  Example
name     string        string                            Name is an optional descriptive name.
queries  []QueryCount  []*QueryCount                     Queries generated while following this rule.

Start

Start identifies a set of starting objects for correlation.

Properties

Name        Type         Go type      Required  Default  Description  Example
class       string       string                          Class for objects
constraint  Constraint   Constraint
objects     interface{}  interface{}                     Objects of class serialized as JSON
queries     []string     []string                        Queries for starting objects

Store

Store is a map of name:value attributes used to connect to a store.

Glossary

The terms used to discuss observability are often used with slightly different meanings in different contexts. This glossary clarifies how Korrel8r uses common terms.

OpenTelemetry means the definition matches how OpenTelemetry uses the term.
Korrel8r means the term has a specific meaning for Korrel8r.

class

Korrel8r Has a special meaning for Korrel8r, see Overview

data model

OpenTelemetry describes the data contents of a signal as a set of named fields. For example a log record might have a "body" field containing the original message, and a "timestamp" field containing the time it was produced.

domain

Korrel8r Has a special meaning for Korrel8r, see About domains

object

Korrel8r Has a special meaning for Korrel8r, see Overview

query

Korrel8r Has a special meaning for Korrel8r, see Overview

rule

Korrel8r Has a special meaning for Korrel8r, see About rules

resource

OpenTelemetry A source of observable signals. Resources include nodes, pods, hosts and other entities that are described by signals.

signal

Korrel8r A single data-item generated by some observable resource, for example a log record, cluster event, or metric time-series.

store

Korrel8r Has a special meaning for Korrel8r, see Overview


1. This is the same syntax used by the Kubernetes kubectl tool with the --output=template option.