Complete Tutorial Guide: AWS CloudWatch Agent

1. What is AWS CloudWatch Agent?

AWS CloudWatch Agent is a software agent installed on servers or compute environments to collect telemetry and send it to Amazon CloudWatch.

It can collect:

Telemetry          Example
Metrics            CPU, memory, disk, network, process metrics
Logs               Application logs, system logs, web server logs
Traces             Application traces using OpenTelemetry or X-Ray-compatible tracing
Custom metrics     Business or application-level metrics
StatsD metrics     Metrics from apps using StatsD
collectd metrics   Metrics from collectd plugins

AWS describes the CloudWatch Agent as a software component that collects metrics, logs, and traces from EC2 instances, on-premises servers, and containerized applications.


2. Why Do We Need CloudWatch Agent?

Some AWS metrics are available automatically. For example, EC2 sends CPU and network metrics to CloudWatch by default.

But some important metrics are not available by default, such as:

Metric                    Available by Default?   Needs CloudWatch Agent?
EC2 CPU utilization       Yes                     No
EC2 network in/out        Yes                     No
EC2 disk read/write       Yes                     No
EC2 memory usage          No                      Yes
EC2 disk space usage      No                      Yes
Application log files     No                      Yes
Custom app metrics        No                      Yes
Process-level metrics     No                      Yes
On-prem server metrics    No                      Yes
OpenTelemetry traces      No                      Yes / collector setup

So, CloudWatch Agent is mainly used when you want deeper visibility inside the operating system and application, not just AWS infrastructure-level metrics.


3. CloudWatch Agent Capabilities

3.1 Collect System Metrics

CloudWatch Agent can collect server-level metrics such as:

Category    Example Metrics
CPU         cpu_usage_idle, cpu_usage_user, cpu_usage_system
Memory      mem_used_percent, mem_available
Disk        disk_used_percent, disk_free
Disk I/O    reads, writes, read bytes, write bytes
Network     bytes sent, bytes received, packets
Swap        swap_used_percent
Processes   process count, process CPU, process memory

This is very important for EC2 monitoring because EC2 does not automatically provide memory and disk space metrics.


3.2 Collect Logs

CloudWatch Agent can collect log files and send them to CloudWatch Logs.

Examples:

Source               Example Log Path
Linux system log     /var/log/messages
Ubuntu system log    /var/log/syslog
Application log      /opt/app/logs/app.log
Nginx access log     /var/log/nginx/access.log
Nginx error log      /var/log/nginx/error.log
Apache logs          /var/log/httpd/access_log
Windows Event Logs   Application, System, Security

3.3 Collect Custom Metrics

You can configure the agent to collect custom metrics from:

  • StatsD
  • collectd
  • procstat
  • application metric sources
  • OpenTelemetry metrics

Example custom metric use cases:

  • Number of active users
  • Queue processing delay
  • Payment failures
  • Background job success count
  • Number of running app workers
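
As a rough illustration of the StatsD route, the sketch below formats a metric in the plain StatsD line format and sends it over UDP. It assumes the agent configuration has a statsd section enabled and listening on localhost port 8125 (the usual StatsD port). The metric names and both helper functions are made up for this example and are not part of the agent.

```python
import socket


def format_statsd(name: str, value: float, metric_type: str = "c") -> str:
    """Build one StatsD line: <name>:<value>|<type>.

    Only three common types are handled in this sketch:
    c = counter, g = gauge, ms = timer.
    """
    if metric_type not in {"c", "g", "ms"}:
        raise ValueError(f"unsupported StatsD type: {metric_type}")
    return f"{name}:{value}|{metric_type}"


def send_statsd(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one StatsD line over UDP (fire-and-forget, no delivery check)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))
```

An application could call send_statsd(format_statsd("payments.failed", 1)) each time a payment fails. Because UDP is fire-and-forget, the call does not confirm that the agent received the metric.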

3.4 Collect Traces

CloudWatch Agent can receive traces from applications over the OpenTelemetry Protocol (OTLP) or from the X-Ray SDK and forward them to AWS X-Ray.

This helps with distributed tracing, especially for microservices.


3.5 Support Multiple Environments

CloudWatch Agent can run on:

Environment                   Supported?
Amazon EC2 Linux              Yes
Amazon EC2 Windows            Yes
On-premises Linux servers     Yes
On-premises Windows servers   Yes
Containerized workloads       Yes
Hybrid environments           Yes

4. High-Level Architecture

EC2 / Server / Container
        |
        | CloudWatch Agent
        v
Amazon CloudWatch
        |
        |-- Metrics
        |-- Logs
        |-- Traces
        |-- Alarms
        |-- Dashboards

Example:

Linux EC2 Instance
  ├── CPU metrics
  ├── Memory metrics
  ├── Disk metrics
  ├── /var/log/messages
  ├── /var/log/nginx/access.log
  └── Application logs
        |
        v
CloudWatch Agent
        |
        v
CloudWatch Metrics + CloudWatch Logs

5. Prerequisites Before Installing CloudWatch Agent

Before installation, check these items.

5.1 EC2 Instance Must Exist

You need a running EC2 instance, either Linux or Windows.

For beginner labs, use Amazon Linux 2, Amazon Linux 2023, Ubuntu, or Windows Server.


5.2 IAM Role Is Required

The EC2 instance needs an IAM role with permission to send data to CloudWatch.

Attach this AWS managed policy to the EC2 instance role:

CloudWatchAgentServerPolicy

If you are creating or storing configuration in Systems Manager Parameter Store, you may also need:

CloudWatchAgentAdminPolicy

For most beginner labs, use:

Use Case                                 IAM Policy
Agent sends metrics/logs to CloudWatch   CloudWatchAgentServerPolicy
Admin creates/stores agent config        CloudWatchAgentAdminPolicy

5.3 Systems Manager Agent Recommended

For console-based installation, the EC2 instance should be managed by AWS Systems Manager.

That means:

  • SSM Agent is installed.
  • Instance has an IAM role allowing Systems Manager access.
  • Instance can reach Systems Manager endpoints.
  • Instance appears under Systems Manager managed nodes.

For Amazon Linux and many Windows AMIs, SSM Agent is often already installed.


5.4 Network Access

The instance must be able to reach:

  • CloudWatch
  • CloudWatch Logs
  • Systems Manager, if using SSM
  • Parameter Store, if storing config there

This can be through:

  • Public internet
  • NAT Gateway
  • VPC endpoints

6. Installation Methods

There are two common ways to install CloudWatch Agent:

Method                        Best For
AWS Systems Manager console   Console-only labs and managed EC2
Manual server installation    When you have SSH/RDP access

Since your students have AWS Console access only, the best method is:

Install CloudWatch Agent using AWS Systems Manager Run Command from the AWS Console.


7. Step-by-Step: Install CloudWatch Agent Using AWS Console

Goal

Install CloudWatch Agent on an EC2 instance without using CLI or SSH.


Step 1: Attach IAM Role to EC2

  1. Open AWS Console.
  2. Go to EC2.
  3. Click Instances.
  4. Select the target EC2 instance.
  5. Click Actions.
  6. Choose Security.
  7. Choose Modify IAM role.
  8. Attach an IAM role that has:
       CloudWatchAgentServerPolicy
       AmazonSSMManagedInstanceCore

If no role exists, your instructor or admin must create one.

Required policies:

Policy                         Purpose
AmazonSSMManagedInstanceCore   Allows Systems Manager to manage the instance
CloudWatchAgentServerPolicy    Allows CloudWatch Agent to publish metrics and logs

Step 2: Confirm Instance Appears in Systems Manager

  1. Open AWS Systems Manager.
  2. In the left menu, go to Node Management.
  3. Click Fleet Manager or Managed Nodes.
  4. Confirm your EC2 instance appears.

If the instance does not appear:

  • IAM role may be missing.
  • SSM Agent may not be installed.
  • Network access may be blocked.
  • Wrong AWS Region may be selected.

Step 3: Use Run Command to Install Agent

  1. Open AWS Systems Manager.
  2. Go to Run Command.
  3. Click Run command.
  4. In Command document, search for:
       AWS-ConfigureAWSPackage
  5. Select AWS-ConfigureAWSPackage.
  6. In Command parameters, set:
       Field     Value
       Action    Install
       Name      AmazonCloudWatchAgent
       Version   latest
  7. In Targets, select your EC2 instance.
  8. Scroll down.
  9. Click Run.
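
For readers who later automate this lab, the console form above maps onto a Run Command request. The sketch below only builds the request as a plain dictionary; it does not call AWS. The function name and the sample instance ID are placeholders, and the lowercase parameter keys (action, name, version) are the document's parameter names as far as I know, so confirm them on the document's Parameters tab before relying on them.

```python
def build_install_request(instance_ids: list[str]) -> dict:
    """Build a Run Command request that installs the CloudWatch Agent package."""
    if not instance_ids:
        raise ValueError("at least one instance ID is required")
    return {
        "DocumentName": "AWS-ConfigureAWSPackage",
        "InstanceIds": list(instance_ids),
        "Parameters": {
            # Run Command parameters are passed as lists of strings.
            "action": ["Install"],
            "name": ["AmazonCloudWatchAgent"],
            "version": ["latest"],
        },
    }


# Hypothetical instance ID, used only to show the shape of the request.
request = build_install_request(["i-0123456789abcdef0"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as keyword arguments to the Systems Manager client's send_command call. That step is left out here because the lab is console-only.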

Step 4: Verify Installation

  1. Stay on the Run Command result page.
  2. Wait for command status to become:
       Success
  3. If it fails, click the command output and check the error.

Common failure reasons:

Error Type            Possible Cause
Instance not listed   Not managed by Systems Manager
Access denied         IAM role missing permissions
Timeout               Network issue
Package failure       OS/package issue

8. Configure CloudWatch Agent

Installing the agent is not enough. You must configure what it should collect.

The CloudWatch Agent configuration is a JSON file that defines:

  • Which metrics to collect
  • Which logs to collect
  • Which traces to collect
  • Collection interval
  • Log group names
  • Metric namespace
  • Dimensions such as InstanceId

AWS documentation states that before running the agent, you must create one or more configuration files, and the file specifies metrics, logs, and traces to collect.


9. Step-by-Step: Create Agent Configuration Using Console and Wizard

There are several ways to create the configuration. For beginner training, the easiest is:

Use the CloudWatch Agent configuration wizard through Systems Manager or create a simple JSON configuration manually.

Below is a beginner-friendly configuration example.


10. Simple Linux CloudWatch Agent Configuration

This example collects:

  • CPU
  • Memory
  • Disk usage
  • Swap
  • System log
  • cloud-init log

{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "metrics": {
    "namespace": "CWAgent",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}",
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}"
    },
    "metrics_collected": {
      "cpu": {
        "measurement": [
          "cpu_usage_idle",
          "cpu_usage_user",
          "cpu_usage_system"
        ],
        "metrics_collection_interval": 60,
        "totalcpu": true
      },
      "mem": {
        "measurement": [
          "mem_used_percent",
          "mem_available"
        ],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": [
          "used_percent",
          "free"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "swap": {
        "measurement": [
          "swap_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/messages",
            "log_group_name": "/aws/ec2/system/messages",
            "log_stream_name": "{instance_id}"
          },
          {
            "file_path": "/var/log/cloud-init.log",
            "log_group_name": "/aws/ec2/system/cloud-init",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}

For Ubuntu, replace:

/var/log/messages

with:

/var/log/syslog
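
If the same configuration is maintained for both distributions, the swap can be scripted. The sketch below works on the configuration loaded as a Python dictionary and returns an adapted copy. The function name is invented for this example, and renaming the log group to /aws/ec2/system/syslog is this sketch's own choice, made so the group name matches the file being collected.

```python
import copy


def adapt_for_ubuntu(config: dict) -> dict:
    """Return a copy of the config that collects /var/log/syslog instead of /var/log/messages."""
    adapted = copy.deepcopy(config)  # leave the caller's config untouched
    entries = (
        adapted.get("logs", {})
        .get("logs_collected", {})
        .get("files", {})
        .get("collect_list", [])
    )
    for entry in entries:
        if entry.get("file_path") == "/var/log/messages":
            entry["file_path"] = "/var/log/syslog"
            # Assumption: keep the log group name in step with the file name.
            entry["log_group_name"] = "/aws/ec2/system/syslog"
    return adapted
```

Entries for other files, such as the cloud-init log, pass through unchanged.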

11. Simple Windows CloudWatch Agent Configuration

This example collects:

  • CPU
  • Memory
  • Logical disk usage
  • Windows System Event Log
  • Windows Application Event Log

{
  "agent": {
    "metrics_collection_interval": 60
  },
  "metrics": {
    "namespace": "CWAgent",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "metrics_collected": {
      "Processor": {
        "measurement": [
          "% Processor Time"
        ],
        "resources": [
          "*"
        ],
        "metrics_collection_interval": 60
      },
      "Memory": {
        "measurement": [
          "% Committed Bytes In Use"
        ],
        "metrics_collection_interval": 60
      },
      "LogicalDisk": {
        "measurement": [
          "% Free Space"
        ],
        "resources": [
          "*"
        ],
        "metrics_collection_interval": 60
      }
    }
  },
  "logs": {
    "logs_collected": {
      "windows_events": {
        "collect_list": [
          {
            "event_name": "System",
            "event_levels": [
              "ERROR",
              "WARNING"
            ],
            "log_group_name": "/aws/ec2/windows/system",
            "log_stream_name": "{instance_id}"
          },
          {
            "event_name": "Application",
            "event_levels": [
              "ERROR",
              "WARNING"
            ],
            "log_group_name": "/aws/ec2/windows/application",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}

12. Store Agent Configuration in Systems Manager Parameter Store

For console-based labs, storing configuration in Parameter Store is useful because Systems Manager can apply it to EC2 instances.

Steps

  1. Open AWS Systems Manager.
  2. Go to Application Management.
  3. Click Parameter Store.
  4. Click Create parameter.
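  5. Name the parameter:
       /AmazonCloudWatch/linux-basic-config
  6. Type:
       String
  7. Data type:
       text
  8. Paste the CloudWatch Agent JSON configuration.
  9. Click Create parameter.

Note: the CloudWatchAgentServerPolicy managed policy grants ssm:GetParameter only for parameters whose names start with AmazonCloudWatch- (with a hyphen). If the agent later fails to read the parameter with an access-denied error, either rename it to something like AmazonCloudWatch-linux-basic-config and use that same name when applying the configuration, or add ssm:GetParameter permission for this parameter to the instance role.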
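
One practical limit is worth checking before pasting a large configuration: a standard-tier String parameter holds at most 4 KB. The sketch below serializes a configuration compactly, checks its size, and builds the fields that correspond to the console form. It does not call AWS, and the function name is an assumption of this example.

```python
import json

STANDARD_TIER_LIMIT_BYTES = 4096  # 4 KB value limit for standard-tier parameters


def build_put_parameter_request(name: str, config: dict) -> dict:
    """Serialize the agent config and build the fields for a String parameter."""
    # Compact separators drop optional whitespace so the value stays small.
    value = json.dumps(config, separators=(",", ":"))
    size = len(value.encode("utf-8"))
    if size > STANDARD_TIER_LIMIT_BYTES:
        raise ValueError(f"config is {size} bytes, above the standard-tier limit")
    return {"Name": name, "Type": "String", "DataType": "text", "Value": value}
```

If the check fails, the usual options are trimming the configuration, splitting it into several parameters, or using the advanced parameter tier.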

13. Start CloudWatch Agent Using Console

After installing the agent and creating configuration, start the agent with Systems Manager Run Command.

Steps

  1. Open AWS Systems Manager.
  2. Go to Run Command.
  3. Click Run command.
  4. Search for command document:
       AmazonCloudWatch-ManageAgent
  5. Select it.
  6. Configure parameters:
       Parameter                         Value
       Action                            configure
       Mode                              ec2
       Optional Configuration Source     ssm
       Optional Configuration Location   /AmazonCloudWatch/linux-basic-config
       Optional Restart                  yes
  7. In Targets, select your EC2 instance.
  8. Click Run.
  9. Wait for status:
       Success

This command applies the configuration and starts or restarts the CloudWatch Agent.
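
The console fields in this step correspond to named parameters of the command document. The sketch below shows that mapping as a plain dictionary without calling AWS. The function name and sample instance ID are placeholders, and the parameter keys reflect the AmazonCloudWatch-ManageAgent document as I understand it, so verify them against the document's Parameters tab.

```python
def build_configure_request(instance_ids: list[str], parameter_name: str) -> dict:
    """Build a Run Command request that applies a Parameter Store config to the agent."""
    if not parameter_name:
        raise ValueError("a Parameter Store name is required")
    return {
        "DocumentName": "AmazonCloudWatch-ManageAgent",
        "InstanceIds": list(instance_ids),
        "Parameters": {
            "action": ["configure"],
            "mode": ["ec2"],
            "optionalConfigurationSource": ["ssm"],
            "optionalConfigurationLocation": [parameter_name],
            "optionalRestart": ["yes"],
        },
    }


# Hypothetical instance ID; the parameter name is the one used in this guide.
configure_request = build_configure_request(
    ["i-0123456789abcdef0"], "/AmazonCloudWatch/linux-basic-config"
)
```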


14. Verify Metrics in CloudWatch

Steps

  1. Open CloudWatch.
  2. Go to Metrics.
  3. Click All metrics.
  4. Look for namespace:
       CWAgent
  5. Open it.
  6. Look for dimensions such as:
    • InstanceId
    • InstanceType
    • AutoScalingGroupName
  7. Select metrics:
    • mem_used_percent
    • disk_used_percent
    • cpu_usage_user
    • cpu_usage_system
  8. Graph the metrics.

Expected Result

Students should now see memory and disk metrics that were not available earlier under default EC2 metrics.


15. Verify Logs in CloudWatch

Steps

  1. Open CloudWatch.
  2. Go to Logs.
  3. Click Log groups or Log Management.
  4. Look for log groups:
       /aws/ec2/system/messages
       /aws/ec2/system/cloud-init
     For Ubuntu, if you also changed log_group_name in the configuration to match the syslog file:
       /aws/ec2/system/syslog
  5. Open the log group.
  6. Open the log stream.
  7. Confirm log events are arriving.

16. Basic Logs Insights Query

After logs are flowing, open CloudWatch → Logs → Logs Insights.

Select the log group and run:

fields @timestamp, @message
| sort @timestamp desc
| limit 20

Search for errors:

fields @timestamp, @message
| filter @message like /error|ERROR|failed|FAILED|Exception/
| sort @timestamp desc
| limit 20
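
The pattern in the error query is case-sensitive, which is easy to overlook. The sketch below applies the same regular expression to a few invented log lines in Python to show which ones it would and would not catch. The sample messages are made up for illustration.

```python
import re

# Same alternation as the Logs Insights filter above (case-sensitive).
ERROR_PATTERN = re.compile(r"error|ERROR|failed|FAILED|Exception")


def matching_messages(messages: list[str]) -> list[str]:
    """Return the messages that the pattern matches anywhere in the text."""
    return [message for message in messages if ERROR_PATTERN.search(message)]


samples = [
    "Started nginx.service",
    "Connection failed: timeout",
    "java.lang.NullPointerException at App.main",
    "Error: disk quota exceeded",  # capital E only, so the pattern misses it
]
matched = matching_messages(samples)
```

Here matched contains the second and third lines only. If mixed-case spellings such as "Error" matter, a case-insensitive pattern such as /(?i)error|failed|exception/ is one option to try in the query.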

17. Create an Alarm on Agent Metric

Now create an alarm using a metric collected by CloudWatch Agent.

Example: Memory Usage Alarm

  1. Open CloudWatch.
  2. Go to Alarms.
  3. Click Create alarm.
  4. Click Select metric.
  5. Choose namespace:
       CWAgent
  6. Select metric:
       mem_used_percent
  7. Select the instance.
  8. Click Select metric.
  9. Set:
    • Statistic: Average
    • Period: 5 minutes
  10. Condition:
    • Greater than 80
  11. Click Next.
  12. Configure notification if allowed.
  13. Name:
       Lab-EC2-Memory-High-CloudWatch-Agent
  14. Create alarm.
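
The same alarm can be written down as data, which makes its settings easy to review. The sketch below builds the alarm definition as a dictionary using the field names of the CloudWatch PutMetricAlarm API, without calling AWS. The function name is invented here, EvaluationPeriods of 1 is an assumption (the steps above do not specify it), and the dimensions passed in must match exactly the set the agent published with the metric.

```python
def build_memory_alarm_request(dimensions: dict[str, str], threshold: float = 80.0) -> dict:
    """Build a PutMetricAlarm-style definition for high memory usage."""
    if "InstanceId" not in dimensions:
        raise ValueError("dimensions must include InstanceId")
    return {
        "AlarmName": "Lab-EC2-Memory-High-CloudWatch-Agent",
        "Namespace": "CWAgent",
        "MetricName": "mem_used_percent",
        # CloudWatch matches a metric only if every dimension is listed.
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "Statistic": "Average",
        "Period": 300,  # 5 minutes, in seconds
        "EvaluationPeriods": 1,  # assumption: alarm after one breaching period
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }
```

If an alarm built this way stays in the insufficient-data state, a mismatch in the dimension set is one of the first things to check.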

18. Create a Dashboard for Agent Metrics

Steps

  1. Open CloudWatch.
  2. Go to Dashboards.
  3. Click Create dashboard.
  4. Name:
       CloudWatch-Agent-Lab-Dashboard
  5. Add a Line widget.
  6. Choose metrics from namespace:
       CWAgent
  7. Add:
    • mem_used_percent
    • disk_used_percent
    • cpu_usage_user
  8. Save widget.
  9. Add alarm widget.
  10. Select the memory alarm.
  11. Save dashboard.
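
Behind the console, a dashboard is stored as a JSON document, which can be viewed from the dashboard's source view. The sketch below generates a minimal body with one line widget for mem_used_percent. It is an illustration under assumptions: the function name, widget title, and layout values are invented. Only the memory metric is included because disk and CPU metrics usually carry extra dimensions (such as path or cpu) that would need to be listed as well.

```python
import json


def build_dashboard_body(region: str, dimensions: dict[str, str]) -> str:
    """Return a dashboard body (JSON string) with one memory line widget."""
    # Flatten {"InstanceId": "i-..."} into ["InstanceId", "i-..."].
    flat_dimensions = [part for pair in dimensions.items() for part in pair]
    widget = {
        "type": "metric",
        "x": 0,
        "y": 0,
        "width": 12,
        "height": 6,
        "properties": {
            "title": "Memory used (%)",
            "view": "timeSeries",
            "region": region,
            "stat": "Average",
            "period": 300,
            "metrics": [["CWAgent", "mem_used_percent", *flat_dimensions]],
        },
    }
    return json.dumps({"widgets": [widget]}, indent=2)
```

The dimension names and values should be copied from the metric as it appears under the CWAgent namespace, since a partial set will not match any metric.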

19. Important Configuration Sections Explained

A CloudWatch Agent configuration usually contains these sections.

agent

Controls agent behavior.

Example:

"agent": {
  "metrics_collection_interval": 60,
  "run_as_user": "root"
}

Meaning:

Field                         Meaning
metrics_collection_interval   How often metrics are collected
run_as_user                   User used to run the agent

metrics

Defines what metrics to collect.

Example:

"metrics": {
  "namespace": "CWAgent",
  "metrics_collected": {
    "mem": {
      "measurement": [
        "mem_used_percent"
      ]
    }
  }
}

Meaning:

Field               Meaning
namespace           CloudWatch namespace where metrics appear
metrics_collected   List of metric plugins
measurement         Specific metrics to collect

append_dimensions

Adds useful identifying dimensions.

Example:

"append_dimensions": {
  "InstanceId": "${aws:InstanceId}",
  "InstanceType": "${aws:InstanceType}"
}

This makes it easier to filter metrics by EC2 instance.


logs

Defines which log files to collect.

Example:

"logs": {
  "logs_collected": {
    "files": {
      "collect_list": [
        {
          "file_path": "/var/log/messages",
          "log_group_name": "/aws/ec2/system/messages",
          "log_stream_name": "{instance_id}"
        }
      ]
    }
  }
}

Meaning:

Field             Meaning
file_path         Local log file path
log_group_name    CloudWatch Logs group name
log_stream_name   Stream name inside the log group

20. Best Practices

Use Clear Log Group Names

Good:

/aws/ec2/prod/web/system
/aws/ec2/prod/web/nginx-access
/aws/ec2/prod/app/application

Bad:

testlog1
mylog
abc

Use a Consistent Namespace

The default namespace is:

CWAgent

For teams, you can use:

CompanyName/EC2
ApplicationName/Infrastructure

Avoid Collecting Too Much Data

Every metric and log you collect has a cost impact.

Be careful with:

  • Very short collection intervals
  • Too many disk mount points
  • Too many log files
  • Debug logs
  • High-cardinality custom metrics

Set Log Retention

After log groups are created:

  1. Go to CloudWatch → Logs → Log groups.
  2. Select log group.
  3. Choose Retention settings.
  4. Set retention, such as:
    • 7 days
    • 14 days
    • 30 days
    • 90 days

Do not leave unnecessary logs with indefinite retention.
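
CloudWatch Logs accepts only specific retention values, so an arbitrary number of days (for example 45) is rejected. The sketch below validates a value and builds the two fields of a PutRetentionPolicy request without calling AWS. The function name is invented, and the list of allowed values is reproduced from memory of the API reference, so treat it as an assumption and confirm it there.

```python
# Assumed list of retention periods (in days) accepted by CloudWatch Logs.
ALLOWED_RETENTION_DAYS = (
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731,
    1096, 1827, 2192, 2557, 2922, 3288, 3653,
)


def build_retention_request(log_group_name: str, days: int) -> dict:
    """Build the fields for setting a log group's retention period."""
    if days not in ALLOWED_RETENTION_DAYS:
        raise ValueError(f"{days} is not an accepted retention period")
    return {"logGroupName": log_group_name, "retentionInDays": days}
```

For the lab's log groups, build_retention_request("/aws/ec2/system/messages", 14) describes a two-week retention setting.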


Use Separate Configurations

Use different configs for different server types.

Example:

Server Type              Config
Web server               Nginx logs + system metrics
App server               Application logs + memory/disk
Database client server   App logs + process metrics
Windows server           Windows Event Logs + performance counters

21. Common Troubleshooting

Problem: CWAgent Namespace Not Visible

Possible causes:

  • Agent not installed.
  • Agent not started.
  • Wrong Region selected.
  • IAM role missing CloudWatchAgentServerPolicy.
  • Configuration was not applied.
  • No metrics collected yet.

Problem: Logs Not Appearing

Possible causes:

  • Wrong file path.
  • Agent does not have permission to read log file.
  • Log file does not exist.
  • IAM role missing permissions.
  • Wrong Region.
  • Agent not restarted after config change.

Problem: Systems Manager Cannot Find Instance

Possible causes:

  • Instance does not have AmazonSSMManagedInstanceCore policy.
  • SSM Agent not installed.
  • Instance cannot reach Systems Manager endpoint.
  • Wrong AWS Region.
  • Instance is stopped.

Problem: Memory Metrics Missing

Possible causes:

  • Agent not configured to collect memory.
  • Agent config only collects logs.
  • Agent not restarted.
  • Wrong namespace or dimension selected.

22. Beginner Lab Flow

Use this flow for students:

1. Open EC2 and confirm instance exists.
2. Attach IAM role with SSM and CloudWatch Agent permissions.
3. Open Systems Manager and confirm instance is managed.
4. Install CloudWatch Agent using Run Command.
5. Create configuration in Parameter Store.
6. Apply configuration using AmazonCloudWatch-ManageAgent.
7. Open CloudWatch Metrics.
8. Find CWAgent namespace.
9. Graph memory and disk metrics.
10. Open CloudWatch Logs.
11. Verify log groups.
12. Run Logs Insights query.
13. Create memory alarm.
14. Create dashboard.

23. Final Summary

AWS CloudWatch Agent is the main way to collect deeper telemetry from EC2 instances, on-premises servers, and some containerized workloads.

It is used to collect:

  • Memory metrics
  • Disk usage metrics
  • Process metrics
  • System logs
  • Application logs
  • Windows Event Logs
  • Custom metrics
  • StatsD metrics
  • collectd metrics
  • OpenTelemetry metrics and traces

For beginner AWS observability, CloudWatch Agent is especially important because it fills the biggest EC2 monitoring gap:

EC2 gives you CPU and network metrics by default, but CloudWatch Agent gives you memory, disk, logs, and deeper operating-system visibility.

The recommended console-only setup is:

EC2 IAM role → Systems Manager → Install Agent → Store JSON config in Parameter Store → Apply config → Verify CWAgent metrics and logs in CloudWatch.