Graphite is an open-source time-series monitoring system used to store, query, and graph metrics from servers, applications, databases, and infrastructure. It does not collect metrics itself; collectors or scripts send them to it.
In simple words:
Graphite stores numerical metrics over time so you can see trends, performance, and problems.
Example metrics Graphite can store:
CPU usage
Memory usage
Disk usage
Network traffic
Application request count
API response time
Error count
Database query time
A sample Graphite metric looks like this:
linux.server1.cpu.usage 72.5 1710000000
Meaning:
| Part | Meaning |
|---|---|
| linux.server1.cpu.usage | Metric name/path |
| 72.5 | Metric value |
| 1710000000 | Unix timestamp (seconds) |
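The three fields map directly onto a small parser. The sketch below is only an illustration of the line format; `Metric` and `parse_metric_line` are hypothetical names, not part of Graphite.

```python
from typing import NamedTuple


class Metric(NamedTuple):
    path: str
    value: float
    timestamp: int


def parse_metric_line(line: str) -> Metric:
    """Split one plaintext line into path, value, and Unix timestamp."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    path, value, timestamp = parts
    return Metric(path, float(value), int(float(timestamp)))


metric = parse_metric_line("linux.server1.cpu.usage 72.5 1710000000")
# The dots in the path form a hierarchy: linux > server1 > cpu > usage.
levels = metric.path.split(".")
```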
How Graphite works
flowchart TD
A[Server / Application] --> B[Metric Collector]
B --> C[Carbon]
C --> D[Whisper Storage]
D --> E[Graphite Web / API]
E --> F[Grafana Dashboard]
Main components
| Component | Role |
|---|---|
| Carbon | Receives metrics and writes them to storage |
| Whisper | Stores time-series metrics on disk |
| Graphite Web | Provides UI and API to query metrics |
| Grafana | Often used to create dashboards and alerts from Graphite data |
Where Graphite is used
Graphite is commonly used for:
Infrastructure monitoring
Linux server monitoring
Application performance metrics
Capacity planning
Historical trend analysis
Grafana dashboard backend
Graphite vs Grafana
| Tool | Purpose |
|---|---|
| Graphite | Stores and serves metrics |
| Grafana | Visualizes metrics and creates dashboards/alerts |
So the simplest explanation is:
Graphite is the metrics backend. Grafana is the visualization frontend.
Below are the main components of a Graphite monitoring stack, with data flow and architecture diagrams in Mermaid.
1. Graphite Components Overview
Graphite is mainly made of these core components:
| Component | Purpose |
|---|---|
| Application / Server / Script | Generates metrics like CPU usage, memory, disk, request count, latency, etc. |
| StatsD / CollectD / Telegraf / Diamond | Collects metrics from servers or applications and forwards them to Graphite. |
| Carbon | Receives metrics, processes them, and writes them into Whisper storage. |
| Carbon Receiver | Listens for incoming metrics over TCP/UDP. carbon-cache defaults to port 2003 (plaintext) and 2004 (pickle); carbon-aggregator defaults to 2023 for plaintext. |
| Carbon Cache | Temporarily stores incoming metrics in memory before writing them to disk. |
| Carbon Relay | Optional component used to forward, split, or replicate metrics to multiple Carbon nodes. |
| Carbon Aggregator | Optional component used to aggregate metrics before storing them. |
| Whisper | Graphite’s time-series database file format. Stores metrics on disk. |
| Graphite Web | Web application used to query, render, and visualize metrics. |
| Graphite API / Render API | Allows tools like Grafana to query Graphite metrics. |
| Grafana | Visualization and dashboarding tool that connects to Graphite as a data source. |
| Alerts | Usually handled in Grafana, not directly by classic Graphite. |
2. Simple Graphite Flow
Application / Server
|
v
Metric Collector
(StatsD / CollectD / Telegraf / Custom Script)
|
v
Carbon
|
v
Whisper Storage
|
v
Graphite Web / Graphite API
|
v
Grafana Dashboard / Alerts
3. Detailed Graphite Architecture Flow
flowchart TD
A[Application / Linux Server / Service] --> B[Metric Collector]
B --> B1[StatsD]
B --> B2[CollectD]
B --> B3[Telegraf]
B --> B4[Custom Script]
B1 --> C[Carbon Receiver]
B2 --> C
B3 --> C
B4 --> C
C --> D[Carbon Cache]
D --> E[Whisper Storage]
E --> F[Graphite Web]
F --> G[Graphite Render API]
G --> H[Grafana]
H --> I[Dashboards]
H --> J[Alerts]
4. Graphite Components with Ports
| Component | Common Port | Description |
|---|---|---|
| Carbon plaintext receiver | 2003 | Receives metrics in plain text format. |
| Carbon pickle receiver | 2004 | Receives metrics in Python pickle format. |
| Carbon cache query port | 7002 | Used internally for cache queries. |
| Graphite Web | 80, 8080, or custom | Web UI and API access. |
| Grafana | 3000 | Grafana web interface. |
| StatsD | 8125/UDP | Receives application metrics and forwards to Graphite. |
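The pickle receiver on port 2004 accepts batches rather than single lines. As described in the Graphite documentation, a message is a pickled list of `(path, (timestamp, value))` tuples preceded by a 4-byte big-endian length header. The sketch below builds such a message; `build_pickle_message` is a hypothetical helper name and the second metric is made-up sample data.

```python
import pickle
import struct


def build_pickle_message(points):
    """Build one pickle-protocol message from (path, timestamp, value) triples."""
    # Carbon expects a list of (path, (timestamp, value)) tuples.
    payload = pickle.dumps(
        [(path, (timestamp, value)) for path, timestamp, value in points],
        protocol=2,
    )
    # The payload is preceded by its length as a 4-byte unsigned big-endian integer.
    header = struct.pack("!L", len(payload))
    return header + payload


message = build_pickle_message([
    ("linux.server1.cpu.usage", 1710000000, 72.5),
    ("linux.server1.memory.usage", 1710000000, 41.0),  # sample value
])
```

The resulting bytes would be written to a TCP connection to the pickle port. Batching many points into one message is the main reason to prefer this over the plaintext port.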
5. Metric Flow Example
Example metric:
linux.server1.cpu.usage 72.5 1710000000
This means:
| Part | Meaning |
|---|---|
| linux.server1.cpu.usage | Metric path/name |
| 72.5 | Metric value |
| 1710000000 | Unix timestamp |
Flow:
Linux Server
sends metric:
linux.server1.cpu.usage 72.5 1710000000
Carbon receives it on port 2003
Carbon Cache stores it temporarily
Carbon Cache writes it to a Whisper .wsp file
Graphite Web reads it
Grafana displays it in dashboard
6. Full Graphite + Grafana Monitoring Flow
flowchart LR
subgraph Sources["Metric Sources"]
A1[Linux Servers]
A2[Applications]
A3[Databases]
A4[Network Devices]
A5[Custom Scripts]
end
subgraph Collectors["Metric Collectors"]
B1[StatsD]
B2[CollectD]
B3[Telegraf]
B4[Diamond]
end
subgraph Graphite["Graphite Stack"]
C1[Carbon Receiver]
C2[Carbon Cache]
C3[Carbon Relay Optional]
C4[Carbon Aggregator Optional]
D1[Whisper Storage]
E1[Graphite Web]
E2[Graphite Render API]
end
subgraph Visualization["Visualization and Alerting"]
F1[Grafana]
F2[Dashboards]
F3[Alerts]
end
A1 --> B2
A2 --> B1
A3 --> B3
A4 --> B3
A5 --> C1
B1 --> C1
B2 --> C1
B3 --> C1
B4 --> C1
C1 --> C2
C2 --> D1
C1 --> C3
C3 --> C2
C1 --> C4
C4 --> C2
D1 --> E1
E1 --> E2
E2 --> F1
F1 --> F2
F1 --> F3
7. How Each Component Works
Application / Server
This is the original source of metrics.
Examples:
CPU usage
Memory usage
Disk usage
Network traffic
HTTP request count
API latency
Error count
Database query time
The server itself does not usually write directly to Whisper. It sends metrics to Graphite through a collector or script.
Metric Collector
Collectors gather metrics and send them to Carbon.
Common collectors:
| Collector | Use Case |
|---|---|
| StatsD | Application-level metrics such as counters, timers, gauges. |
| CollectD | Linux system metrics such as CPU, memory, disk, network. |
| Telegraf | Modern metrics agent with many plugins. |
| Diamond | Older Python-based Graphite collector. |
| Custom Scripts | Simple scripts that push metrics directly to Carbon. |
Example using shell:
echo "linux.server1.cpu.usage 75 $(date +%s)" | nc 127.0.0.1 2003
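The same push can be done from an application without shelling out. This is a minimal sketch using only the standard library; `format_metric` and `send_metric` are hypothetical helper names, and the host and port assume a Carbon plaintext receiver on the local machine.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Return one plaintext-protocol line, terminated by a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="127.0.0.1", port=2003, timeout=2.0):
    """Open a TCP connection to Carbon, send one line, and close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


line = format_metric("linux.server1.cpu.usage", 75, timestamp=1710000000)
# send_metric(line)  # uncomment when a Carbon receiver is listening
```

The trailing newline matters: Carbon reads the plaintext protocol line by line.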
Carbon
Carbon is the ingestion engine of Graphite.
It receives metrics and writes them to Whisper.
Carbon has multiple sub-components:
carbon-cache
carbon-relay
carbon-aggregator
Carbon Cache
Carbon Cache receives metrics and keeps them briefly in memory before writing them to disk.
Main responsibilities:
Receive metric data
Buffer data in memory
Apply storage schema
Write data to Whisper files
Serve recent cached data to Graphite Web
Carbon Relay
Carbon Relay is optional.
It is used when Graphite is scaled across multiple servers.
Main uses:
Forward metrics to multiple Carbon caches
Shard metrics across multiple storage nodes
Replicate metrics for high availability
Route metrics based on rules
Example:
server1 metrics -> carbon-cache-1
server2 metrics -> carbon-cache-2
app metrics -> carbon-cache-3
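One way to picture sharding is to hash the metric name and pick a destination from a fixed list. The sketch below is a deliberately simplified hash-modulo scheme, not carbon-relay's actual routing code; the destination names are taken from the example above and `pick_destination` is a hypothetical helper.

```python
import hashlib

DESTINATIONS = ["carbon-cache-1", "carbon-cache-2", "carbon-cache-3"]


def pick_destination(metric_path, destinations=DESTINATIONS):
    """Map a metric path to one destination, always the same one for a given path."""
    digest = hashlib.sha256(metric_path.encode("utf-8")).hexdigest()
    return destinations[int(digest, 16) % len(destinations)]


for path in ("linux.server1.cpu.usage", "linux.server2.cpu.usage", "app.requests.count"):
    print(path, "->", pick_destination(path))
```

The important property is that a given metric always lands on the same node, because its Whisper file lives on exactly one node. A plain modulo reshuffles most metrics when the destination list changes, which is why a consistent-hashing ring is the usual choice in practice.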
Carbon Aggregator
Carbon Aggregator is optional.
It aggregates many metrics before storage.
Example:
app.server1.requests.count
app.server2.requests.count
app.server3.requests.count
Can be aggregated into:
app.all.requests.count
Useful for reducing query complexity and storage volume.
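The effect of a sum aggregation can be sketched in a few lines. This only illustrates the idea; the real carbon-aggregator is driven by rule files and buffers points per interval, and the values below are made-up sample data.

```python
from collections import defaultdict


def aggregate_sum(points, output_path="app.all.requests.count"):
    """Sum values that share a timestamp into a single output series."""
    totals = defaultdict(float)
    for _path, timestamp, value in points:
        totals[timestamp] += value
    return [(output_path, ts, total) for ts, total in sorted(totals.items())]


points = [
    ("app.server1.requests.count", 1710000000, 120),
    ("app.server2.requests.count", 1710000000, 95),
    ("app.server3.requests.count", 1710000000, 143),
]
aggregated = aggregate_sum(points)
# aggregated == [("app.all.requests.count", 1710000000, 358.0)]
```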
Whisper
Whisper is Graphite’s storage engine.
It stores each metric as a .wsp file.
Example metric:
linux.server1.cpu.usage
May be stored as:
/opt/graphite/storage/whisper/linux/server1/cpu/usage.wsp
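The mapping from metric name to file path is mechanical: every dot becomes a directory separator and the last segment becomes the file name. A small sketch, assuming the storage root shown above (installations may use a different one); `whisper_path` is a hypothetical helper name.

```python
from pathlib import PurePosixPath

# Assumed storage root, taken from the example path above; installs may differ.
WHISPER_ROOT = PurePosixPath("/opt/graphite/storage/whisper")


def whisper_path(metric_path, root=WHISPER_ROOT):
    """Turn a dotted metric name into the matching .wsp file path."""
    *directories, leaf = metric_path.split(".")
    return root.joinpath(*directories, leaf + ".wsp")


path = whisper_path("linux.server1.cpu.usage")
# str(path) == "/opt/graphite/storage/whisper/linux/server1/cpu/usage.wsp"
```

This is also why metric naming deserves care: the name hierarchy becomes the directory hierarchy on disk.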
Whisper is similar to RRD storage. It stores fixed-size time-series data based on retention rules.
Example retention:
10s:6h,1m:7d,10m:5y
Meaning:
| Retention | Meaning |
|---|---|
| 10s:6h | Keep 10-second data for 6 hours |
| 1m:7d | Keep 1-minute data for 7 days |
| 10m:5y | Keep 10-minute data for 5 years |
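Because each archive has a fixed size, the number of stored points follows directly from the retention string: duration divided by precision. The sketch below works this out; the helper names are hypothetical and a year is assumed to be 365 days.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(text):
    """Convert a value such as '10s' or '5y' to seconds (a year is taken as 365 days)."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def parse_retention(definition):
    """Return (seconds_per_point, number_of_points) for each archive."""
    archives = []
    for archive in definition.split(","):
        precision, duration = archive.strip().split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


archives = parse_retention("10s:6h,1m:7d,10m:5y")
# archives == [(10, 2160), (60, 10080), (600, 262800)]
total_points = sum(points for _step, points in archives)
# total_points == 275040
```

Assuming roughly 12 bytes per stored point (an assumption about the Whisper file layout, not stated in this document), that is about 3.3 MB per metric, allocated when the file is created.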
Graphite Web
Graphite Web is the web application and API layer.
It allows users and tools to:
Search metrics
Query metrics
Render graphs
Apply functions
Expose Render API
Example Graphite Render API:
/render?target=linux.server1.cpu.usage&from=-1h&format=json
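A client can assemble this URL and read the JSON response with the standard library. In the sketch below, `build_render_url` and `latest_value` are hypothetical helpers, `graphite-server` is a placeholder host, and the sample response is hand-written in the shape the JSON format returns (a list of series, each with `target` and `datapoints` of `[value, timestamp]` pairs), not real output.

```python
import json
from urllib.parse import urlencode


def build_render_url(base_url, target, time_from="-1h"):
    """Compose a Render API URL that returns JSON."""
    query = urlencode({"target": target, "from": time_from, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


def latest_value(response_text):
    """Return the newest non-null (value, timestamp) pair of the first series."""
    series = json.loads(response_text)
    for value, timestamp in reversed(series[0]["datapoints"]):
        if value is not None:
            return value, timestamp
    return None


url = build_render_url("http://graphite-server", "linux.server1.cpu.usage")

# Hand-written sample response, not real output.
sample = (
    '[{"target": "linux.server1.cpu.usage", "datapoints": '
    '[[70.1, 1710000000], [72.5, 1710000060], [null, 1710000120]]}]'
)
newest = latest_value(sample)
```

Null values are normal at the end of a series: the most recent interval may not have received a point yet, so clients usually skip them.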
Grafana
Grafana is commonly used as the visualization layer for Graphite.
Grafana connects to:
http://<graphite-server>
or
http://<graphite-server>:8080
Grafana uses Graphite’s API to search metrics and build dashboards.
Typical Grafana panels:
CPU Usage
Memory Usage
Disk Usage
Network In/Out
Load Average
Process Count
Application Request Rate
Error Rate
Latency
8. Graphite Data Flow with Example
sequenceDiagram
participant App as Application / Linux Server
participant Agent as StatsD / CollectD / Telegraf
participant Carbon as Carbon Receiver
participant Cache as Carbon Cache
participant Whisper as Whisper Storage
participant Web as Graphite Web API
participant Grafana as Grafana
App->>Agent: Generate metrics
Agent->>Carbon: Send metric over TCP/UDP
Carbon->>Cache: Accept and buffer metric
Cache->>Whisper: Write metric to .wsp file
Grafana->>Web: Query metric target
Web->>Whisper: Read historical data
Web->>Cache: Read recent cached data
Web-->>Grafana: Return time-series data
Grafana->>Grafana: Render dashboard / alert
9. End-to-End Example
Suppose a Linux server reports CPU usage.
Step 1: Metric is generated
linux.web01.cpu.usage 65 1710000000
Step 2: Metric is sent to Carbon
echo "linux.web01.cpu.usage 65 $(date +%s)" | nc graphite-server 2003
Step 3: Carbon receives it
Carbon plaintext receiver listens on port 2003
Step 4: Carbon writes to Whisper
/opt/graphite/storage/whisper/linux/web01/cpu/usage.wsp
Step 5: Graphite Web exposes it
http://graphite-server/render?target=linux.web01.cpu.usage&from=-1h
Step 6: Grafana visualizes it
Grafana panel query:
linux.web01.cpu.usage
10. Graphite Stack in One Diagram
flowchart TB
A[Metric Sources] --> B[Metric Collection Layer]
B --> C[Carbon Ingestion Layer]
C --> D[Whisper Storage Layer]
D --> E[Graphite Query Layer]
E --> F[Grafana Visualization Layer]
A1[Linux CPU, Memory, Disk, Network] --> A
A2[Application Counters, Timers, Gauges] --> A
A3[Database Metrics] --> A
A4[Custom Business Metrics] --> A
B1[StatsD] --> B
B2[CollectD] --> B
B3[Telegraf] --> B
B4[Custom Scripts] --> B
C1[carbon-cache] --> C
C2[carbon-relay] --> C
C3[carbon-aggregator] --> C
D1[Whisper .wsp Files] --> D
E1[Graphite Web UI] --> E
E2[Graphite Render API] --> E
E3[Graphite Functions] --> E
F1[Grafana Dashboards] --> F
F2[Grafana Explore] --> F
F3[Grafana Alerts] --> F
11. Important Point for Students
Graphite itself is mainly responsible for:
Receiving metrics
Storing metrics
Querying metrics
Rendering metric data
Grafana is mainly responsible for:
Beautiful dashboards
Explore UI
Alert rules
Notification channels
Dashboard sharing
So in modern monitoring labs:
Graphite = Metrics backend
Grafana = Visualization and alerting frontend
12. Final Summary
Metric Source
↓
Collector or Agent
↓
Carbon Receiver
↓
Carbon Cache
↓
Whisper Storage
↓
Graphite Web / Render API
↓
Grafana
↓
Dashboard / Alert
In simple terms:
Graphite receives and stores time-series metrics. Grafana reads those metrics from Graphite and converts them into dashboards and alerts.
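StatsD receives numeric metrics sent by apps, scripts, services, or servers, usually over UDP (some implementations also accept TCP), aggregates them, and periodically flushes the results to a backend such as Graphite. It does not handle logs or traces; it deals only with numeric time-series data.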
Types of data collected using StatsD
| Metric type | What it means | Example use case |
|---|---|---|
| Counter | Counts how many times something happened | Number of API requests, login attempts, errors |
| Gauge | Current value at a point in time | Memory usage, queue size, active users |
| Timer | Measures how long something takes | API response time, DB query duration |
| Histogram / Distribution | Measures value distribution | Request latency percentiles like p95, p99 |
| Set | Counts unique values | Unique users, unique IPs, unique sessions |
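All of these types share one wire format: `<name>:<value>|<type>`. The following sketch formats such lines and shows how one could be sent over UDP; `statsd_line` and `send_statsd` are hypothetical helpers, and the host and port assume a StatsD daemon on the local machine.

```python
import socket


def statsd_line(name, value, metric_type):
    """Format one StatsD datagram: <name>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}"


def send_statsd(line, host="127.0.0.1", port=8125):
    """Send one datagram over UDP; there is no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


lines = [
    statsd_line("api.requests", 1, "c"),
    statsd_line("queue.size", 45, "g"),
    statsd_line("api.response_time", 245, "ms"),
    statsd_line("unique.users", "user-42", "s"),
]
# for line in lines:
#     send_statsd(line)  # uncomment when a StatsD daemon is listening
```

UDP is fire-and-forget, so instrumented code does not block or fail when the daemon is down. The cost is that datagrams can be lost silently.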
Examples
1. Counter data
Used to count events.
api.requests:1|c
login.success:1|c
login.failed:1|c
payment.errors:1|c
Meaning:
api.requests increased by 1
login.success increased by 1
payment.errors increased by 1
Common use cases:
Total HTTP requests
Total errors
Total signups
Total orders
Total failed payments
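On the server side, counter increments are summed until the next flush and then reported. The sketch below imitates that step under a simplifying assumption of a 10-second flush interval; `flush_counters` is a hypothetical function, not StatsD's own code.

```python
from collections import Counter

FLUSH_INTERVAL_SECONDS = 10  # assumed flush interval for this sketch


def flush_counters(datagrams, interval=FLUSH_INTERVAL_SECONDS):
    """Sum counter increments per name and derive a per-second rate."""
    totals = Counter()
    for datagram in datagrams:
        name, rest = datagram.split(":", 1)
        value, metric_type = rest.split("|")[:2]
        if metric_type == "c":
            totals[name] += float(value)
    return {
        name: {"count": total, "rate": total / interval}
        for name, total in totals.items()
    }


received = ["api.requests:1|c", "api.requests:1|c", "login.failed:1|c", "api.requests:1|c"]
summary = flush_counters(received)
# summary["api.requests"] == {"count": 3.0, "rate": 0.3}
```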
2. Gauge data
Used to send the current value.
queue.size:45|g
memory.used:712|g
active.users:128|g
disk.used.percent:67|g
Common use cases:
CPU usage
Memory usage
Disk usage
Queue depth
Active connections
Number of running jobs
3. Timer data
Used to measure duration.
api.response_time:245|ms
db.query_time:38|ms
cache.lookup_time:4|ms
Meaning:
API response took 245 ms
Database query took 38 ms
Cache lookup took 4 ms
Common use cases:
API latency
Database query latency
External API call duration
File upload time
Job execution time
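In application code, timers are usually produced by wrapping the operation being measured. The sketch below does this with a context manager; `timed` is a hypothetical helper that appends the formatted line to a list instead of sending it, and the `sleep` call stands in for real work.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(name, sink):
    """Measure the wrapped block and append a '<name>:<ms>|ms' line to sink."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = int((time.perf_counter() - start) * 1000)
        sink.append(f"{name}:{elapsed_ms}|ms")


lines = []
with timed("db.query_time", lines):
    time.sleep(0.01)  # stand-in for the real database call
```

The `finally` clause means the duration is recorded even when the wrapped block raises, which keeps failed calls visible in latency graphs.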
4. Histogram / distribution data
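Used to understand the spread of values. The |h type is an extension found in some StatsD implementations (for example DogStatsD); the original StatsD daemon documents only counters, gauges, timers, and sets.
Example: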
request.size:2048|h
response.size:5120|h
order.amount:499|h
Common use cases:
Request payload size
Response size
Order amount
Latency distribution
File size distribution
Depending on backend support, this can help calculate:
avg
min
max
p50
p90
p95
p99
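These statistics are simple to compute from one batch of samples. The sketch below uses the nearest-rank method for percentiles, which is one common definition; backends may interpolate differently. The function names and latency values are made up for illustration.

```python
def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def summarize(values):
    """Compute the summary statistics listed above for one batch of samples."""
    ordered = sorted(values)
    summary = {
        "avg": sum(ordered) / len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }
    for pct in (50, 90, 95, 99):
        summary[f"p{pct}"] = percentile(ordered, pct)
    return summary


latencies_ms = [12, 15, 18, 20, 22, 25, 30, 45, 80, 250]
stats = summarize(latencies_ms)
# stats["p50"] == 22, stats["p90"] == 80, stats["p99"] == 250
```

The sample shows why percentiles matter: the average (51.7 ms) is pulled up by a single 250 ms outlier, while p50 shows that a typical request took 22 ms.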
5. Set data
Used to count unique values.
unique.users:raj@example.com|s
unique.ips:192.168.1.10|s
unique.sessions:abc123|s
Common use cases:
Unique users
Unique sessions
Unique visitors
Unique IP addresses
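A set metric reports how many distinct values were seen during one flush interval, which corresponds directly to a Python `set`. A minimal sketch; `count_unique` is a hypothetical function and the second email address is made-up sample data.

```python
def count_unique(datagrams):
    """Count distinct values per set metric within one flush interval."""
    members = {}
    for datagram in datagrams:
        name, rest = datagram.split(":", 1)
        value, metric_type = rest.rsplit("|", 1)
        if metric_type == "s":
            members.setdefault(name, set()).add(value)
    return {name: len(values) for name, values in members.items()}


received = [
    "unique.users:raj@example.com|s",
    "unique.users:raj@example.com|s",  # duplicate, counted once
    "unique.users:amy@example.com|s",
    "unique.ips:192.168.1.10|s",
]
unique_counts = count_unique(received)
# unique_counts == {"unique.users": 2, "unique.ips": 1}
```

Only the count reaches the backend, and the set is cleared at each flush. A user seen in two different intervals is therefore counted once in each.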
Real-world examples of StatsD metrics
For a web application:
web.requests:1|c
web.errors:1|c
web.response_time:180|ms
web.active_users:35|g
web.unique_visitors:visitor123|s
For Linux/server monitoring:
system.cpu.usage:72|g
system.memory.used:8045|g
system.disk.used_percent:61|g
system.loadavg.1min:2.4|g
For business monitoring:
orders.created:1|c
orders.failed:1|c
cart.checkout_time:3200|ms
payment.amount:1299|h
active.subscriptions:840|g
In simple words
StatsD collects data like:
How many times something happened?
What is the current value?
How long did something take?
What is the distribution of values?
How many unique things happened?
So, StatsD is mainly used for:
Application performance monitoring
Infrastructure metrics
Business metrics
Custom service metrics
Dashboarding in Graphite/Grafana
Alerting on abnormal behavior