1. What is AWS CloudWatch Agent?
AWS CloudWatch Agent is a software agent installed on servers or compute environments to collect telemetry and send it to Amazon CloudWatch.
It can collect:
| Telemetry | Example |
|---|---|
| Metrics | CPU, memory, disk, network, process metrics |
| Logs | Application logs, system logs, web server logs |
| Traces | Application traces using OpenTelemetry or X-Ray-compatible tracing |
| Custom metrics | Business or application-level metrics |
| StatsD metrics | Metrics from apps using StatsD |
| collectd metrics | Metrics from collectd plugins |
AWS describes the CloudWatch Agent as a software component that collects metrics, logs, and traces from EC2 instances, on-premises servers, and containerized applications.
2. Why Do We Need CloudWatch Agent?
Some AWS metrics are available automatically. For example, EC2 sends CPU and network metrics to CloudWatch by default.
But some important metrics are not available by default, such as:
| Metric | Available by Default? | Needs CloudWatch Agent? |
|---|---|---|
| EC2 CPU utilization | Yes | No |
| EC2 network in/out | Yes | No |
| EC2 disk read/write I/O (instance store volumes; EBS volume I/O appears under the AWS/EBS namespace) | Yes | No |
| EC2 memory usage | No | Yes |
| EC2 disk space usage | No | Yes |
| Application log files | No | Yes |
| Custom app metrics | No | Yes |
| Process-level metrics | No | Yes |
| On-prem server metrics | No | Yes |
| OpenTelemetry traces | No | Yes / collector setup |
So, CloudWatch Agent is mainly used when you want deeper visibility inside the operating system and application, not just AWS infrastructure-level metrics.
3. CloudWatch Agent Capabilities
3.1 Collect System Metrics
CloudWatch Agent can collect server-level metrics such as:
| Category | Example Metrics |
|---|---|
| CPU | cpu_usage_idle, cpu_usage_user, cpu_usage_system |
| Memory | mem_used_percent, mem_available |
| Disk | disk_used_percent, disk_free |
| Disk I/O | reads, writes, read bytes, write bytes |
| Network | bytes sent, bytes received, packets |
| Swap | swap_used_percent |
| Processes | process count, process CPU, process memory |
This is very important for EC2 monitoring because EC2 does not automatically provide memory and disk space metrics.
3.2 Collect Logs
CloudWatch Agent can collect log files and send them to CloudWatch Logs.
Examples:
| Source | Example Log Path |
|---|---|
| Linux system log | /var/log/messages |
| Ubuntu system log | /var/log/syslog |
| Application log | /opt/app/logs/app.log |
| Nginx access log | /var/log/nginx/access.log |
| Nginx error log | /var/log/nginx/error.log |
| Apache logs | /var/log/httpd/access_log |
| Windows Event Logs | Application, System, Security |
3.3 Collect Custom Metrics
You can configure the agent to collect custom metrics from:
- StatsD
- collectd
- procstat
- application metric sources
- OpenTelemetry metrics
Example custom metric use cases:
- Number of active users
- Queue processing delay
- Payment failures
- Background job success count
- Number of running app workers
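As a minimal sketch of the StatsD path, an application can format a metric line and send it over UDP to the agent. This assumes the agent configuration has a `statsd` section under `metrics_collected` and that the listener uses its default port 8125 on the local host. The function names and the metric names (`payments.failed`, `queue.delay_ms`) are hypothetical, and the sketch restricts itself to three common StatsD types.

```python
import socket

# Assumption: the agent's StatsD listener runs locally on its default port.
STATSD_ADDR = ("127.0.0.1", 8125)


def statsd_line(name: str, value, kind: str) -> str:
    """Format one StatsD datagram as <name>:<value>|<type>.

    kind is "c" (counter), "g" (gauge), or "ms" (timer).
    """
    if kind not in {"c", "g", "ms"}:
        raise ValueError(f"unsupported StatsD type: {kind}")
    return f"{name}:{value}|{kind}"


def send_metric(name: str, value, kind: str, addr=STATSD_ADDR) -> None:
    """Send one metric over UDP; UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(statsd_line(name, value, kind).encode("ascii"), addr)
```

For example, `send_metric("payments.failed", 1, "c")` would count one payment failure, and `send_metric("queue.delay_ms", 245, "ms")` would record a queue delay of 245 ms.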
3.4 Collect Traces
CloudWatch Agent can receive traces from applications over the OpenTelemetry Protocol (OTLP) or from the AWS X-Ray SDK and forward them to AWS X-Ray.
This helps with distributed tracing, especially for microservices.
3.5 Support Multiple Environments
CloudWatch Agent can run on:
| Environment | Supported? |
|---|---|
| Amazon EC2 Linux | Yes |
| Amazon EC2 Windows | Yes |
| On-premises Linux servers | Yes |
| On-premises Windows servers | Yes |
| Containerized workloads | Yes |
| Hybrid environments | Yes |
4. High-Level Architecture
EC2 / Server / Container
|
| CloudWatch Agent
v
Amazon CloudWatch
|
|-- Metrics
|-- Logs
|-- Traces
|-- Alarms
|-- Dashboards
Example:
Linux EC2 Instance
├── CPU metrics
├── Memory metrics
├── Disk metrics
├── /var/log/messages
├── /var/log/nginx/access.log
└── Application logs
|
v
CloudWatch Agent
|
v
CloudWatch Metrics + CloudWatch Logs
5. Prerequisites Before Installing CloudWatch Agent
Before installation, check these items.
5.1 EC2 Instance Must Exist
You need a running EC2 instance (Linux or Windows).
For beginner labs, use Amazon Linux 2, Amazon Linux 2023, Ubuntu, or Windows Server.
5.2 IAM Role Is Required
The EC2 instance needs an IAM role with permission to send data to CloudWatch.
Attach this AWS managed policy to the EC2 instance role:
CloudWatchAgentServerPolicy
Whoever creates or stores the agent configuration in Systems Manager Parameter Store (typically an instructor or admin, not the lab instances) may also need:
CloudWatchAgentAdminPolicy
In summary:
| Use Case | IAM Policy |
|---|---|
| Agent sends metrics/logs to CloudWatch | CloudWatchAgentServerPolicy |
| Admin creates/stores agent config | CloudWatchAgentAdminPolicy |
5.3 Systems Manager Agent Recommended
For console-based installation, the EC2 instance should be managed by AWS Systems Manager.
That means:
- SSM Agent is installed.
- Instance has an IAM role allowing Systems Manager access.
- Instance can reach Systems Manager endpoints.
- Instance appears under Systems Manager managed nodes.
For Amazon Linux and many Windows AMIs, SSM Agent is often already installed.
5.4 Network Access
The instance must be able to reach:
- CloudWatch
- CloudWatch Logs
- Systems Manager, if using SSM
- Parameter Store, if storing config there
This can be through:
- Public internet
- NAT Gateway
- VPC endpoints
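For the VPC endpoint option, the sketch below lists the interface endpoint service names that correspond to the services above. The function name is hypothetical, and the set shown is the commonly documented one; depending on the SSM Agent version, not every Systems Manager endpoint may be required.

```python
def agent_endpoint_services(region: str, use_ssm: bool = True) -> list[str]:
    """Return VPC interface endpoint service names for a private subnet."""
    suffixes = ["monitoring", "logs"]  # CloudWatch metrics, CloudWatch Logs
    if use_ssm:
        # Systems Manager (which also serves Parameter Store) and the
        # messaging channels used by SSM Agent.
        suffixes += ["ssm", "ssmmessages", "ec2messages"]
    return [f"com.amazonaws.{region}.{suffix}" for suffix in suffixes]


if __name__ == "__main__":
    for service in agent_endpoint_services("us-east-1"):
        print(service)
```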
6. Installation Methods
There are two common ways to install CloudWatch Agent:
| Method | Best For |
|---|---|
| AWS Systems Manager console | Console-only labs and managed EC2 |
| Manual server installation | When you have SSH/RDP access |
Since your students have AWS Console access only, the best method is:
Install CloudWatch Agent using AWS Systems Manager Run Command from the AWS Console.
7. Step-by-Step: Install CloudWatch Agent Using AWS Console
Goal
Install CloudWatch Agent on an EC2 instance without using CLI or SSH.
Step 1: Attach IAM Role to EC2
- Open AWS Console.
- Go to EC2.
- Click Instances.
- Select the target EC2 instance.
- Click Actions.
- Choose Security.
- Choose Modify IAM role.
- Attach an IAM role that has:
CloudWatchAgentServerPolicy
AmazonSSMManagedInstanceCore
If no role exists, your instructor or admin must create one.
Required policies:
| Policy | Purpose |
|---|---|
| AmazonSSMManagedInstanceCore | Allows Systems Manager to manage the instance |
| CloudWatchAgentServerPolicy | Allows CloudWatch Agent to publish metrics and logs |
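For the instructor or admin who has to create the role, the sketch below shows the two pieces involved: a trust policy that lets EC2 assume the role, and the ARNs of the two managed policies from the table. The constant and function names are hypothetical; creating the role and its instance profile is left to whatever tooling is in use.

```python
import json

# ARNs of the two AWS managed policies named in the table above.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore",
    "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy",
]


def ec2_trust_policy() -> dict:
    """Trust policy that allows the EC2 service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(ec2_trust_policy(), indent=2))
```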
Step 2: Confirm Instance Appears in Systems Manager
- Open AWS Systems Manager.
- In the left menu, go to Node Management.
- Click Fleet Manager or Managed Nodes.
- Confirm your EC2 instance appears.
If the instance does not appear:
- IAM role may be missing.
- SSM Agent may not be installed.
- Network access may be blocked.
- Wrong AWS Region may be selected.
Step 3: Use Run Command to Install Agent
- Open AWS Systems Manager.
- Go to Run Command.
- Click Run command.
- In Command document, search for:
AWS-ConfigureAWSPackage
- Select AWS-ConfigureAWSPackage.
- In Command parameters, set:
| Field | Value |
|---|---|
| Action | Install |
| Name | AmazonCloudWatchAgent |
| Version | latest |
- In Targets, select your EC2 instance.
- Scroll down.
- Click Run.
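Outside a console-only lab, the same Run Command request could be sketched with the AWS SDK for Python. `install_agent_request` and `run_install` are hypothetical helper names, and `run_install` assumes boto3 is installed and credentials are configured. The parameter values mirror the console form above.

```python
def install_agent_request(instance_ids: list[str]) -> dict:
    """Keyword arguments for SSM SendCommand that mirror the console form."""
    return {
        "DocumentName": "AWS-ConfigureAWSPackage",
        "InstanceIds": instance_ids,
        # Run Command parameter values are always lists of strings.
        "Parameters": {
            "action": ["Install"],
            "name": ["AmazonCloudWatchAgent"],
            "version": ["latest"],
        },
    }


def run_install(instance_ids: list[str], region: str) -> str:
    """Send the command and return its command ID."""
    import boto3  # imported lazily so the builder works without boto3

    ssm = boto3.client("ssm", region_name=region)
    response = ssm.send_command(**install_agent_request(instance_ids))
    return response["Command"]["CommandId"]
```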
Step 4: Verify Installation
- Stay on the Run Command result page.
- Wait for command status to become:
Success
- If it fails, click the command output and check the error.
Common failure reasons:
| Error Type | Possible Cause |
|---|---|
| Instance not listed | Not managed by Systems Manager |
| Access denied | IAM role missing permissions |
| Timeout | Network issue |
| Package failure | OS/package issue |
8. Configure CloudWatch Agent
Installing the agent is not enough. You must configure what it should collect.
The CloudWatch Agent configuration is a JSON file that defines:
- Which metrics to collect
- Which logs to collect
- Which traces to collect
- Collection interval
- Log group names
- Metric namespace
- Dimensions such as InstanceId
According to the AWS documentation, you must create at least one configuration file before running the agent; the file specifies the metrics, logs, and traces to collect.
9. Step-by-Step: Create Agent Configuration Using Console and Wizard
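There are several ways to create the configuration. For beginner training, the two simplest are:
- Run the agent's configuration wizard on the instance. The wizard is an interactive command-line program, so in a console-only lab it has to be run from a shell session on the instance, for example through Systems Manager Session Manager.
- Write a small JSON configuration by hand.
This guide takes the second route and provides beginner-friendly configurations for Linux and Windows.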
10. Simple Linux CloudWatch Agent Configuration
This example collects:
- CPU
- Memory
- Disk usage
- Swap
- System log
- Application log
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "metrics": {
    "namespace": "CWAgent",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}",
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}"
    },
    "metrics_collected": {
      "cpu": {
        "measurement": [
          "cpu_usage_idle",
          "cpu_usage_user",
          "cpu_usage_system"
        ],
        "metrics_collection_interval": 60,
        "totalcpu": true
      },
      "mem": {
        "measurement": [
          "mem_used_percent",
          "mem_available"
        ],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": [
          "used_percent",
          "free"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "swap": {
        "measurement": [
          "swap_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/messages",
            "log_group_name": "/aws/ec2/system/messages",
            "log_stream_name": "{instance_id}"
          },
          {
            "file_path": "/var/log/cloud-init.log",
            "log_group_name": "/aws/ec2/system/cloud-init",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
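For Ubuntu, replace the file path:
/var/log/messages
with:
/var/log/syslog
If the log group name should match, also change log_group_name to /aws/ec2/system/syslog. On Amazon Linux 2023, /var/log/messages does not exist by default because rsyslog is not installed, so collect a file that is present, such as /var/log/cloud-init.log.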
11. Simple Windows CloudWatch Agent Configuration
This example collects:
- CPU
- Memory
- Logical disk usage
- Windows System Event Log
- Windows Application Event Log
{
  "agent": {
    "metrics_collection_interval": 60
  },
  "metrics": {
    "namespace": "CWAgent",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "metrics_collected": {
      "Processor": {
        "measurement": [
          "% Processor Time"
        ],
        "resources": [
          "*"
        ],
        "metrics_collection_interval": 60
      },
      "Memory": {
        "measurement": [
          "% Committed Bytes In Use"
        ],
        "metrics_collection_interval": 60
      },
      "LogicalDisk": {
        "measurement": [
          "% Free Space"
        ],
        "resources": [
          "*"
        ],
        "metrics_collection_interval": 60
      }
    }
  },
  "logs": {
    "logs_collected": {
      "windows_events": {
        "collect_list": [
          {
            "event_name": "System",
            "event_levels": [
              "ERROR",
              "WARNING"
            ],
            "log_group_name": "/aws/ec2/windows/system",
            "log_stream_name": "{instance_id}"
          },
          {
            "event_name": "Application",
            "event_levels": [
              "ERROR",
              "WARNING"
            ],
            "log_group_name": "/aws/ec2/windows/application",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
12. Store Agent Configuration in Systems Manager Parameter Store
For console-based labs, storing configuration in Parameter Store is useful because Systems Manager can apply it to EC2 instances.
Steps
- Open AWS Systems Manager.
- Go to Application Management.
- Click Parameter Store.
- Click Create parameter.
- Name the parameter:
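/AmazonCloudWatch/linux-basic-config
Note: CloudWatchAgentServerPolicy only grants read access to parameters whose names begin with AmazonCloudWatch-. The name above still works in this lab because AmazonSSMManagedInstanceCore, attached in Step 1, also allows the instance to read parameters.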
- Type:
String
- Data type:
text
- Paste the CloudWatch Agent JSON configuration.
- Click Create parameter.
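Hand-edited JSON often carries a stray comma or a misspelled section name, so a quick local check before pasting can save a failed Run Command later. The sketch below is only a sanity check under stated assumptions: the function name is hypothetical, it checks only the four top-level sections (agent, metrics, logs, traces), and the 4 KB figure assumes a standard-tier parameter.

```python
import json

TOP_LEVEL_SECTIONS = {"agent", "metrics", "logs", "traces"}
STANDARD_TIER_LIMIT_BYTES = 4096  # assumption: standard-tier parameter


def check_agent_config(text: str) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]

    problems = []
    unknown = sorted(set(config) - TOP_LEVEL_SECTIONS)
    if unknown:
        problems.append(f"unexpected top-level sections: {unknown}")
    if not {"metrics", "logs", "traces"} & set(config):
        problems.append("no metrics, logs, or traces section, so nothing would be collected")
    if len(text.encode("utf-8")) > STANDARD_TIER_LIMIT_BYTES:
        problems.append("larger than 4 KB; minify it or use an advanced-tier parameter")
    return problems
```

Passing the text of the configuration to `check_agent_config` returns an empty list when these basic checks pass. It does not validate plugin names or measurement names.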
13. Start CloudWatch Agent Using Console
After installing the agent and creating configuration, start the agent with Systems Manager Run Command.
Steps
- Open AWS Systems Manager.
- Go to Run Command.
- Click Run command.
- Search for command document:
AmazonCloudWatch-ManageAgent
- Select it.
- Configure parameters:
| Parameter | Value |
|---|---|
| Action | configure |
| Mode | ec2 |
| Optional Configuration Source | ssm |
| Optional Configuration Location | /AmazonCloudWatch/linux-basic-config |
| Optional Restart | yes |
- In Targets, select your EC2 instance.
- Click Run.
- Wait for status:
Success
This command applies the configuration and starts or restarts the CloudWatch Agent.
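The console fields in the table correspond to the parameters of the AmazonCloudWatch-ManageAgent document. As a rough sketch, the request could be built in Python as shown below. The function name is hypothetical, and the parameter keys are the document's parameter names as far as I know them, so verify them against the document's Parameters tab before relying on this.

```python
def configure_agent_request(instance_ids: list[str], parameter_name: str) -> dict:
    """Keyword arguments for SSM SendCommand that apply a stored config."""
    return {
        "DocumentName": "AmazonCloudWatch-ManageAgent",
        "InstanceIds": instance_ids,
        "Parameters": {
            "action": ["configure"],
            "mode": ["ec2"],
            "optionalConfigurationSource": ["ssm"],
            # Name of the Parameter Store entry that holds the JSON config.
            "optionalConfigurationLocation": [parameter_name],
            "optionalRestart": ["yes"],
        },
    }
```

The returned dictionary can be passed to a boto3 SSM client as `client.send_command(**request)`, assuming boto3 and credentials are available.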
14. Verify Metrics in CloudWatch
Steps
- Open CloudWatch.
- Go to Metrics.
- Click All metrics.
- Look for namespace:
CWAgent
- Open it.
- Look for dimensions such as:
- InstanceId
- InstanceType
- AutoScalingGroupName (only if the instance belongs to an Auto Scaling group)
- Select metrics:
- mem_used_percent
- disk_used_percent
- cpu_usage_user
- cpu_usage_system
- Graph the metrics.
Expected Result
Students should now see memory and disk metrics that were not available earlier under default EC2 metrics.
15. Verify Logs in CloudWatch
Steps
- Open CloudWatch.
- Go to Logs.
- Click Log groups or Log Management.
- Look for log groups:
/aws/ec2/system/messages
/aws/ec2/system/cloud-init
For Ubuntu, if log_group_name in the configuration was changed to match the syslog path:
/aws/ec2/system/syslog
- Open the log group.
- Open the log stream.
- Confirm log events are arriving.
16. Basic Logs Insights Query
After logs are flowing, open CloudWatch → Logs → Logs Insights.
Select the log group and run:
fields @timestamp, @message
| sort @timestamp desc
| limit 20
Search for errors:
fields @timestamp, @message
| filter @message like /(?i)(error|failed|exception)/
| sort @timestamp desc
| limit 20
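A regex filter of this kind can be tried locally before running it in Logs Insights. The sketch below uses Python's `re` module as a stand-in (regex engines differ in details, but simple alternation and the `(?i)` flag behave the same way) and compares a case-sensitive alternation with a case-insensitive one. The sample log lines are made up for illustration.

```python
import re

# Hypothetical log lines for illustration only.
SAMPLE_LINES = [
    "Jan 10 10:00:01 web1 app: request completed in 12 ms",
    "Jan 10 10:00:02 web1 app: Error: upstream timed out",
    "Jan 10 10:00:03 web1 app: payment FAILED for order 1234",
    "Jan 10 10:00:04 web1 app: java.lang.IllegalStateException: bad state",
]

CASE_SENSITIVE = re.compile(r"error|ERROR|failed|FAILED|Exception")
CASE_INSENSITIVE = re.compile(r"(?i)(error|failed|exception)")


def matching(pattern: re.Pattern, lines: list[str]) -> list[str]:
    """Keep the lines where the pattern matches anywhere in the line."""
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__":
    print(len(matching(CASE_SENSITIVE, SAMPLE_LINES)))    # misses "Error:"
    print(len(matching(CASE_INSENSITIVE, SAMPLE_LINES)))  # catches it
```

The case-sensitive pattern skips the line containing "Error:" because that capitalization is not among its alternatives, which is the usual reason to prefer a case-insensitive filter.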
17. Create an Alarm on Agent Metric
Now create an alarm using a metric collected by CloudWatch Agent.
Example: Memory Usage Alarm
- Open CloudWatch.
- Go to Alarms.
- Click Create alarm.
- Click Select metric.
- Choose namespace:
CWAgent
- Select metric:
mem_used_percent
- Select the instance.
- Click Select metric.
- Set:
- Statistic: Average
- Period: 5 minutes
- Condition:
- Greater than 80
- Click Next.
- Configure notification if allowed.
- Name:
Lab-EC2-Memory-High-CloudWatch-Agent
- Create alarm.
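The same alarm can be written down as a request for the CloudWatch `PutMetricAlarm` API, which makes the settings explicit. In this sketch the function name is hypothetical, `EvaluationPeriods` of 1 is an assumption (the steps above do not set it), and the dimension list assumes the agent publishes InstanceId and InstanceType. An alarm only receives data when its dimensions match the published metric exactly, so an instance in an Auto Scaling group would also need AutoScalingGroupName.

```python
def memory_alarm_request(instance_id: str, instance_type: str) -> dict:
    """Keyword arguments for CloudWatch PutMetricAlarm on agent memory usage."""
    return {
        "AlarmName": "Lab-EC2-Memory-High-CloudWatch-Agent",
        "Namespace": "CWAgent",
        "MetricName": "mem_used_percent",
        # Must match the dimension set the agent attaches to the metric.
        "Dimensions": [
            {"Name": "InstanceId", "Value": instance_id},
            {"Name": "InstanceType", "Value": instance_type},
        ],
        "Statistic": "Average",
        "Period": 300,           # 5 minutes, in seconds
        "EvaluationPeriods": 1,  # assumption: alarm after one breaching period
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanThreshold",
    }
```

With boto3 available, the dictionary could be passed as `cloudwatch.put_metric_alarm(**memory_alarm_request(...))`.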
18. Create a Dashboard for Agent Metrics
Steps
- Open CloudWatch.
- Go to Dashboards.
- Click Create dashboard.
- Name:
CloudWatch-Agent-Lab-Dashboard
- Add a Line widget.
- Choose metrics from namespace:
CWAgent
- Add:
- mem_used_percent
- disk_used_percent
- cpu_usage_user
- Save widget.
- Add alarm widget.
- Select the memory alarm.
- Save dashboard.
19. Important Configuration Sections Explained
A CloudWatch Agent configuration usually contains these sections.
agent
Controls agent behavior.
Example:
"agent": {
  "metrics_collection_interval": 60,
  "run_as_user": "root"
}
Meaning:
| Field | Meaning |
|---|---|
| metrics_collection_interval | How often metrics are collected |
| run_as_user | User used to run the agent |
metrics
Defines what metrics to collect.
Example:
"metrics": {
  "namespace": "CWAgent",
  "metrics_collected": {
    "mem": {
      "measurement": [
        "mem_used_percent"
      ]
    }
  }
}
Meaning:
| Field | Meaning |
|---|---|
| namespace | CloudWatch namespace where metrics appear |
| metrics_collected | List of metric plugins |
| measurement | Specific metrics to collect |
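On Linux, a measurement can be written with or without its plugin prefix, which is why a `disk` measurement of `used_percent` shows up in CloudWatch as `disk_used_percent`. The sketch below illustrates that naming convention so the names to look for can be predicted from a configuration. It is an illustration, not the agent's own code: the function names are hypothetical, and it handles only measurements given as plain strings.

```python
def cloudwatch_metric_name(plugin: str, measurement: str) -> str:
    """Prefix the measurement with the plugin name unless already prefixed."""
    prefix = f"{plugin}_"
    return measurement if measurement.startswith(prefix) else prefix + measurement


def expected_metric_names(config: dict) -> list[str]:
    """List the metric names a Linux agent config is expected to publish."""
    collected = config.get("metrics", {}).get("metrics_collected", {})
    return sorted(
        cloudwatch_metric_name(plugin, measurement)
        for plugin, settings in collected.items()
        for measurement in settings.get("measurement", [])
    )
```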
append_dimensions
Adds useful identifying dimensions.
Example:
"append_dimensions": {
  "InstanceId": "${aws:InstanceId}",
  "InstanceType": "${aws:InstanceType}"
}
This makes it easier to filter metrics by EC2 instance.
logs
Defines which log files to collect.
Example:
"logs": {
  "logs_collected": {
    "files": {
      "collect_list": [
        {
          "file_path": "/var/log/messages",
          "log_group_name": "/aws/ec2/system/messages",
          "log_stream_name": "{instance_id}"
        }
      ]
    }
  }
}
Meaning:
| Field | Meaning |
|---|---|
| file_path | Local log file path |
| log_group_name | CloudWatch Logs group name |
| log_stream_name | Stream name inside the log group |
20. Best Practices
Use Clear Log Group Names
Good:
/aws/ec2/prod/web/system
/aws/ec2/prod/web/nginx-access
/aws/ec2/prod/app/application
Bad:
testlog1
mylog
abc
Use a Consistent Namespace
The agent's default namespace is:
CWAgent
For teams, you can use:
CompanyName/EC2
ApplicationName/Infrastructure
Avoid Collecting Too Much Data
Each custom metric and each gigabyte of ingested or stored logs adds to the CloudWatch bill.
Be careful with:
- Very short collection intervals
- Too many disk mount points
- Too many log files
- Debug logs
- High-cardinality custom metrics
Set Log Retention
After log groups are created:
- Go to CloudWatch → Logs → Log groups.
- Select log group.
- Choose Retention settings.
- Set retention, such as:
- 7 days
- 14 days
- 30 days
- 90 days
Do not leave unnecessary logs with indefinite retention.
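Retention can also be declared in the agent configuration itself through the optional `retention_in_days` field on a log file entry, so new log groups start with a limit instead of indefinite retention. The sketch below adds that field to every file entry that lacks one. The function name is hypothetical, and it assumes the instance role is allowed to set retention (`logs:PutRetentionPolicy`).

```python
import copy


def with_retention(config: dict, days: int) -> dict:
    """Return a copy with retention_in_days on file entries that lack it."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    files = updated.get("logs", {}).get("logs_collected", {}).get("files", {})
    for entry in files.get("collect_list", []):
        entry.setdefault("retention_in_days", days)
    return updated
```

For example, `with_retention(config, 14)` keeps any retention already set on an entry and applies 14 days to the rest.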
Use Separate Configurations
Use different configs for different server types.
Example:
| Server Type | Config |
|---|---|
| Web server | Nginx logs + system metrics |
| App server | Application logs + memory/disk |
| Database client server | App logs + process metrics |
| Windows server | Windows Event Logs + performance counters |
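One way to keep per-server-type configurations consistent is to generate them from a shared base plus a role-specific list of log files. The sketch below is one possible arrangement, not a required layout: the role names, the helper name, and the `nginx-error` log group are hypothetical, while the paths and the other log group names reuse examples from this guide.

```python
import copy

# Settings every server type shares (trimmed for brevity).
BASE_CONFIG = {
    "agent": {"metrics_collection_interval": 60},
    "metrics": {
        "namespace": "CWAgent",
        "metrics_collected": {
            "mem": {"measurement": ["mem_used_percent"]},
            "disk": {"measurement": ["used_percent"], "resources": ["/"]},
        },
    },
}

# Hypothetical role names mapped to (file path, log group) pairs.
ROLE_LOGS = {
    "web": [
        ("/var/log/nginx/access.log", "/aws/ec2/prod/web/nginx-access"),
        ("/var/log/nginx/error.log", "/aws/ec2/prod/web/nginx-error"),
    ],
    "app": [("/opt/app/logs/app.log", "/aws/ec2/prod/app/application")],
}


def config_for(role: str) -> dict:
    """Build one complete agent configuration for the given server role."""
    config = copy.deepcopy(BASE_CONFIG)  # never mutate the shared base
    config["logs"] = {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": path,
                        "log_group_name": group,
                        "log_stream_name": "{instance_id}",
                    }
                    for path, group in ROLE_LOGS[role]
                ]
            }
        }
    }
    return config
```

Each generated configuration can then be serialized with `json.dumps` and stored as its own Parameter Store entry.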
21. Common Troubleshooting
Problem: CWAgent Namespace Not Visible
Possible causes:
- Agent not installed.
- Agent not started.
- Wrong Region selected.
- IAM role missing CloudWatchAgentServerPolicy.
- Configuration was not applied.
- Metrics not published yet; allow a few minutes after the agent starts.
Problem: Logs Not Appearing
Possible causes:
- Wrong file path.
- Agent does not have permission to read log file.
- Log file does not exist.
- IAM role missing permissions.
- Wrong Region.
- Agent not restarted after config change.
Problem: Systems Manager Cannot Find Instance
Possible causes:
- Instance does not have AmazonSSMManagedInstanceCore policy.
- SSM Agent not installed.
- Instance cannot reach Systems Manager endpoint.
- Wrong AWS Region.
- Instance is stopped.
Problem: Memory Metrics Missing
Possible causes:
- Agent not configured to collect memory.
- Agent config only collects logs.
- Agent not restarted.
- Wrong namespace or dimension selected.
22. Beginner Lab Flow
Use this flow for students:
1. Open EC2 and confirm instance exists.
2. Attach IAM role with SSM and CloudWatch Agent permissions.
3. Open Systems Manager and confirm instance is managed.
4. Install CloudWatch Agent using Run Command.
5. Create configuration in Parameter Store.
6. Apply configuration using AmazonCloudWatch-ManageAgent.
7. Open CloudWatch Metrics.
8. Find CWAgent namespace.
9. Graph memory and disk metrics.
10. Open CloudWatch Logs.
11. Verify log groups.
12. Run Logs Insights query.
13. Create memory alarm.
14. Create dashboard.
23. Final Summary
AWS CloudWatch Agent is the main way to collect deeper telemetry from EC2 instances, on-premises servers, and some containerized workloads.
It is used to collect:
- Memory metrics
- Disk usage metrics
- Process metrics
- System logs
- Application logs
- Windows Event Logs
- Custom metrics
- StatsD metrics
- collectd metrics
- OpenTelemetry metrics and traces
For beginner AWS observability, CloudWatch Agent is especially important because it fills the biggest EC2 monitoring gap:
EC2 gives you CPU and network metrics by default, but CloudWatch Agent gives you memory, disk, logs, and deeper operating-system visibility.
The recommended console-only setup is:
EC2 IAM role → Systems Manager → Install Agent → Store JSON config in Parameter Store → Apply config → Verify CWAgent metrics and logs in CloudWatch.