---
stage: Monitor
group: Platform Insights
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
description: "Get Started monitoring your application"
---

# Get started with monitoring your application in GitLab

Monitoring is a crucial part of maintaining and optimizing your applications.
GitLab observability features help you track errors, analyze application performance, and respond to incidents.

These capabilities are part of the larger DevOps workflow:

![Workflow](img/get_started_monitor_app_v17_3.png)

All of these features can be used independently. For example, you can use
tracing or incidents without using error tracking. However, for the best experience,
use all of these features together.

## Step 1: Determine which project to use

You can use the same project for monitoring that you already use to store your application's source code.

For large applications with multiple services and repositories, you should create a dedicated project
to centralize all telemetry data collected from the different components of the system.
This approach offers several benefits:

- Data is accessible to all development and operations teams, which facilitates collaboration.
- Data from different sources can be queried and correlated in one place, which accelerates investigations.
- It provides a single source of truth for all observability data, making it easier to maintain and update.
- It simplifies access management for administrators by centralizing user permissions in a single project.

To enable observability features, you must be an administrator or have the Owner role for the project.

For more information, see:

- [Create a project](../project/index.md)

## Step 2: Track application errors with error tracking

Error tracking helps you identify, prioritize, and debug errors in your application.
Errors generated by your application are collected by the Sentry SDK,
then stored on either the GitLab or the Sentry back end.
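
For example, many Sentry SDKs read the data source name (DSN) from an environment variable,
so pointing your application at the GitLab or Sentry back end is a configuration change rather than a code change.
A minimal sketch, assuming your SDK supports the standard `SENTRY_DSN` variable and using a placeholder DSN:

```plaintext
SENTRY_DSN="<your_dsn>"
```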

For more information, see:

- [How error tracking works](../../operations/error_tracking.md#how-error-tracking-works)

## Step 3: Monitor application performance with tracing, metrics, and logs

### Enable beta features

The following features are available in closed beta:

- [Distributed tracing](../../operations/tracing.md): Follow application requests across multiple services.
- [Metrics](../../operations/metrics.md): Monitor application and infrastructure performance metrics,
  like request latency, traffic, error rate, or saturation.
- [Logs](../../operations/logs.md): Centralize and analyze application and infrastructure logs.

To make these features available, an administrator must [enable the feature flag](../../administration/feature_flags.md)
named `observability_features` for your project or group. After these features are enabled, you can set up data collection.

### Instrument your application with OpenTelemetry

Traces, metrics, and logs are generated by your application, collected
with OpenTelemetry, and stored on the GitLab back end.

[OpenTelemetry](https://opentelemetry.io/docs/what-is-opentelemetry/) is an open-source
observability framework that provides a collection of tools, APIs, and SDKs for generating,
collecting, and exporting telemetry data. The OpenTelemetry Collector is a key component of this framework.

You can collect and send telemetry data to GitLab using either direct instrumentation
or the OpenTelemetry Collector. This table compares the two methods:

| Method | Pros | Cons |
|--------|------|------|
| Direct instrumentation | - Simpler setup<br>- No infrastructure changes | - Less flexible<br>- No data sampling or processing<br>- Can generate a high volume of data |
| OpenTelemetry Collector | - Centralized configuration<br>- Enables data sampling and processing<br>- Controlled volume of data | - More complex setup<br>- Requires infrastructure changes |

You should use the OpenTelemetry Collector for most setups, especially if your application
is likely to grow in complexity. However, direct instrumentation can be simpler for testing purposes and small applications.

#### Direct instrumentation

You can instrument your application code to send telemetry data directly to GitLab without using a collector.

Choose a guide based on your programming language or framework:

- [Ruby on Rails](../../tutorials/observability/observability_rails_tutorial.md)
- [Node JS](../../tutorials/observability/observability_nodejs_tutorial.md)
- [Python Django](../../tutorials/observability/observability_django_tutorial.md)
- [Java Spring](../../tutorials/observability/observability_java_tutorial.md)
- [.NET](../../tutorials/observability/observability_dotnet_tutorial.md)

For other languages, use the appropriate [OpenTelemetry API or SDK](https://opentelemetry.io/docs/languages/).
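
Regardless of language, the OpenTelemetry SDKs honor a common set of `OTEL_*` environment variables
for configuring the OTLP exporter. A minimal sketch of pointing an SDK directly at GitLab, reusing the
endpoint and token placeholders from the Collector configuration in the next section (the assumption is
that the SDK appends the same `/v1/traces`, `/v1/metrics`, and `/v1/logs` paths as the Collector's
`otlphttp` exporter):

```plaintext
OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
OTEL_EXPORTER_OTLP_ENDPOINT="https://gitlab.com/api/v4/projects/<project_id>/observability"
OTEL_EXPORTER_OTLP_HEADERS="private-token=<your_token>"
```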

#### Using the OpenTelemetry Collector (recommended)

For complex application setups, you should use the OpenTelemetry Collector.

**What is the OpenTelemetry Collector?**

The [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) acts as a proxy that receives, processes, and exports
telemetry data from your application to your monitoring tools, such as GitLab Observability.
It is open source and vendor-neutral, which means you can use it with any compatible tools and avoid vendor lock-in.

Benefits of using the Collector:

- Simplicity: Application services send data to only one destination (the Collector).
- Flexibility: Add or change data destinations from a single place (if you use multiple vendors).
- Advanced features: Sampling, batching, and compression of data (see the sketch after this list).
- Consistency: Uniform data processing.
- Governance: Centralized configuration.
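
For example, sampling and batching are configured as processors in a Collector pipeline.
A minimal sketch, assuming the `batch` processor and the `probabilistic_sampler` processor
from the Collector contrib distribution:

```yaml
processors:
  batch:
    timeout: 5s
  probabilistic_sampler:
    sampling_percentage: 25  # keep roughly 25% of traces

service:
  pipelines:
    traces:
      processors: [probabilistic_sampler, batch]
```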

**Configure the OpenTelemetry Collector**

1. [Quick start installation](https://opentelemetry.io/docs/collector/quick-start/)
1. [Choose a deployment method](https://opentelemetry.io/docs/collector/deployment/) (agent or gateway)
1. [Configure data collection](https://opentelemetry.io/docs/collector/configuration/).
   Add the GitLab endpoint as an exporter in the Collector `config.yaml` file:

   ```yaml
   exporters:
     otlphttp/gitlab:
       endpoint: https://gitlab.com/api/v4/projects/<project_id>/observability/
       headers:
         "private-token": "<your_token>"

   service:
     pipelines:
       traces:
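         # `spanmetrics` is optional here; if you keep it, the spanmetrics
         # connector must also be defined in a top-level `connectors:` section.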
         exporters: [spanmetrics, otlphttp/gitlab]
       metrics:
         exporters: [otlphttp/gitlab]
       logs:
         exporters: [otlphttp/gitlab]
   ```

   Replace the placeholders with the following values:

   - `<project_id>`: The project ID. On the project homepage,
     in the upper-right corner, select the vertical ellipsis (**{ellipsis_v}**), then **Copy project ID**.
   - `<your_token>`: An access token created in the project with the `Developer` role and
     `api` scope. Create a token at the project's **Settings** > **Access tokens**.
   - `gitlab.com`: Replace with your GitLab host if running a self-managed instance.

1. Instrument your application to send data to the Collector.
   Use the language-specific guides above, but point the exporter to your Collector instead of GitLab.
   For example, if your application and your Collector run on the same host, set the OTLP endpoint to the Collector's address:

   ```plaintext
   OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
   ```
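
   For the Collector to accept this data, it must expose an OTLP receiver and include it in its pipelines.
   A minimal sketch, assuming the default OTLP/HTTP port `4318`:

   ```yaml
   receivers:
     otlp:
       protocols:
         http:  # listens on port 4318 by default

   service:
     pipelines:
       traces:
         receivers: [otlp]
       metrics:
         receivers: [otlp]
       logs:
         receivers: [otlp]
   ```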

### Test your setup

After setting up data collection, you can explore the collected data in your project.
In the left sidebar, select **Monitor**, then open the **Tracing**, **Metrics**, or **Logs** page.
Together, these views provide a comprehensive picture of your application's health and performance, helping you troubleshoot issues.

For more information, see:

- [Distributed tracing](../../operations/tracing.md)
- [Metrics](../../operations/metrics.md)
- [Logs](../../operations/logs.md)

## Step 4: Monitor infrastructure with metrics and logs

To monitor the performance and availability of your application's infrastructure,
first install the OpenTelemetry Collector as described previously. Then,
depending on your setup, use one of the following methods to gather metrics and logs:

- For host-level OS metrics: Use the OpenTelemetry Collector with a receiver like
  [Host Metrics](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/hostmetricsreceiver).
  This receiver collects CPU, memory, disk, and network metrics from the host system (see the sketch after this list).
- For cloud-based infrastructure: Use your provider's monitoring solution integrated with OpenTelemetry.
  For example, use receivers like [AWS CloudWatch](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/awscloudwatchreceiver) or [Azure Monitor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/azuremonitorreceiver).
- For containerized applications: Use receivers like
  [Docker stats](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/dockerstatsreceiver/) or
  [Kubelet stats](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kubeletstatsreceiver).
- For Kubernetes clusters: Follow the [external Kubernetes guide](https://opentelemetry.io/docs/kubernetes/getting-started/).
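
For example, host-level metrics collection is configured as a receiver in the Collector.
A minimal sketch, assuming a contrib distribution of the Collector (which includes the Host Metrics receiver)
and the `otlphttp/gitlab` exporter defined earlier:

```yaml
receivers:
  hostmetrics:
    collection_interval: 60s
    scrapers:
      cpu:
      memory:
      disk:
      network:

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [otlphttp/gitlab]
```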

## Step 5: Manage alerts and incidents

Set up incident management features to troubleshoot issues and resolve incidents collaboratively.

For more information, see:

- [Incident Management](../../operations/incident_management/index.md)

## Step 6: Analyze and improve

Use the data and insights gathered to continuously improve your application and the monitoring process:

1. Create insight dashboards to analyze the issues
   or incidents created and closed, and to assess the performance of your incident response.
1. Create executable runbooks to help on-call engineers remediate incidents autonomously.
1. Regularly review your monitoring setup and adjust sampling thresholds, or add new metrics as your application evolves.
1. Conduct post-incident reviews to identify areas for improvement in both your application and your incident response process.
1. Use the insights gained from monitoring to inform your development priorities and technical debt reduction efforts.

For more information, see:

- [Insight dashboards](../project/insights/index.md)
- [Executable runbooks](../project/clusters/runbooks/index.md)