NetworkOut Complete Guide


To see exactly what traffic is being served through your ens5 interface (the interface AWS bills for “NetworkOut”), you have three primary options:


Option 1: Use VPC Flow Logs (Recommended for AWS)

VPC Flow Logs will capture all incoming and outgoing network traffic at the ENI (network interface) level—so you’ll see everything going in/out through ens5.

How to Enable and Analyze VPC Flow Logs:

  1. Enable VPC Flow Logs:
    • Go to your AWS VPC Dashboard > “Your VPCs” > Select your VPC.
    • Choose Actions → Create Flow Log.
    • For “Filter”, select All.
    • For “Destination”, choose an S3 bucket or CloudWatch Logs.
    • Create the flow log.
  2. Download/View Flow Logs from S3 or CloudWatch.
  3. Analyze Flows for Outbound Traffic:
    • Each flow log line will include: source IP, destination IP, port, bytes transferred, etc.
    • You can filter for outbound traffic (srcaddr = your EC2’s private IP, or srcport = 443/80).
    Example command (after downloading a log file; field positions assume the default flow log format):
    awk '$4 == "172.31.34.212" {bytes[$5" "$7] += $10} END {for (k in bytes) print bytes[k], k}' flowlog.log | sort -nr | head -20
    • This shows the total bytes sent to each destination IP and destination port, largest first.
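
Flow logs delivered to S3 arrive as many gzip-compressed files, so a small wrapper can save some typing. The following is a minimal sketch, not a definitive tool: the function name flow_top_destinations is made up for illustration, and it assumes the default flow log field order (srcaddr in field 4, dstaddr in field 5, bytes in field 10).

```shell
# Sum outbound bytes per destination IP for one source IP.
# Usage: flow_top_destinations SRC_IP FILE [FILE...]
# Assumption: default flow log layout, i.e.
#   version account-id interface-id srcaddr dstaddr srcport dstport
#   protocol packets bytes start end action log-status
flow_top_destinations() (
  src_ip="$1"
  shift
  for f in "$@"; do
    case "$f" in
      *.gz) gzip -dc -- "$f" ;;   # S3 deliveries are gzip-compressed
      *)    cat -- "$f" ;;
    esac
  done | awk -v ip="$src_ip" '
    # Header lines and NODATA records never match the source IP, so they are skipped.
    $4 == ip && $10 ~ /^[0-9]+$/ { bytes[$5] += $10 }
    END { for (d in bytes) printf "%.0f %s\n", bytes[d], d }
  ' | sort -nr | head -20
)
```

For example, flow_top_destinations 172.31.34.212 *.log.gz prints one line per destination IP, with total bytes first and the largest consumers at the top.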

Option 2: Use ss, iftop, or nload on the Server

  • ss -tpn shows active connections, not historical bandwidth.
  • To see live traffic on ens5: sudo iftop -i ens5
    • This will give you a real-time, interactive view of traffic in/out of ens5.
    • You’ll see which remote IPs are sending/receiving the most data right now.
  • Or, use nload: sudo nload ens5
    • Shows live upload/download speeds for that interface.
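
If iftop and nload are not installed, the kernel's own interface counters give a rough outbound rate. This is a sketch under assumptions: the function names are invented for illustration, and SYSFS_NET is a hypothetical override that exists only so the functions can be pointed at a test directory instead of /sys/class/net.

```shell
# Read the kernel's cumulative transmit byte counter for an interface.
# Usage: iface_tx_bytes IFACE
iface_tx_bytes() {
  cat "${SYSFS_NET:-/sys/class/net}/$1/statistics/tx_bytes"
}

# Print the average outbound bytes per second over an interval.
# Usage: iface_tx_rate IFACE [SECONDS]   (default interval: 5 seconds)
iface_tx_rate() (
  iface="$1"
  secs="${2:-5}"
  before=$(iface_tx_bytes "$iface") || exit 1
  sleep "$secs"
  after=$(iface_tx_bytes "$iface") || exit 1
  echo $(( (after - before) / secs ))
)
```

Running iface_tx_rate ens5 10 samples the counter twice, ten seconds apart. Unlike iftop, it shows only the total rate, not which remote IPs are responsible.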

Option 3: Use Apache/Nginx Logs

  • Web server logs show what content is being served, but they don’t tie requests to a specific interface.
  • All public traffic to your EC2 web server goes out through ens5 unless you have multiple network cards or special routing.

Summary Table

Tool | Shows historic? | Shows live? | Shows content? | AWS integrated?
VPC Flow Logs | Yes | No | Only IP/Port | Yes
iftop/nload | No | Yes | No (just IPs) | No
Apache Logs | Yes | No | Yes (URLs) | No

Typical Workflow to Check “ens5” Traffic

  1. Enable VPC Flow Logs for full history and billing-level analysis.
  2. Use iftop for live monitoring on ens5 (great for troubleshooting spikes).
  3. Correlate with web server logs for content details.

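### MAGICAL COMMAND
Count GET requests per URL (access_log):
grep -o 'GET [^ ]\+ ' access_log | sort | uniq -c
The same, most requested first:
grep -o 'GET [^ ]\+ ' access_log | sort | uniq -c | sort -nr
POST requests per URL, most requested first:
grep -o 'POST [^ ]\+ ' access_log | sort | uniq -c | sort -nr

The same for ssl_request_log:
grep -o 'GET [^ ]\+ ' ssl_request_log | sort | uniq -c | sort -nr
grep -o 'POST [^ ]\+ ' ssl_request_log | sort | uniq -c | sort -nr

------
Count every IPv4 address that appears anywhere in access_log:
grep -o "[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+" access_log | sort | uniq -c
Top 10 client IPs by request count (access_log, client IP in field 1):
awk '{ print $1}' access_log | sort | uniq -c | sort -nr | head -n 10
-----
Top 10 client IPs by request count (ssl_request_log, client IP in field 3):
awk '{ print $3 }' ssl_request_log | sort | uniq -c | sort -nr | head -n 10
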
Find the Top IPs by Bandwidth Used		
awk '{ip=$3; size=$NF} {bytes[ip]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i],i}' ssl_request_log | sort -nr | head -20		
		
		
Find the Top URLs by Bandwidth Used		
awk '{url=$7; size=$NF} {bytes[url]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i],i}' ssl_request_log | sort -nr | head -20		
		
Find Top Downloaded Files		
awk '{url=$7; size=$NF} /\.pdf|\.jpg|\.png|\.zip|\.mp4|\.mov/ {bytes[url]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i],i}' ssl_request_log | sort -nr | head -20		
		
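Find Top User Agents (bots/browsers) by Requests
This needs a log in the combined format, such as access_log; the default mod_ssl ssl_request_log format does not record the user agent.
awk -F\" '{print $6}' access_log | sort | uniq -c | sort -nr | head -20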


Here’s a ready-to-use list of Linux commands you can run on your log files (like ssl_request_log or access_log) to identify which images, JS, PDFs, and other big assets are consuming the most bandwidth.


1. Top Resource URLs by Total Bytes Sent (all file types)

awk '{url=$7; size=$NF} {bytes[url]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i], i}' ssl_request_log | sort -nr | head -30
  • What it does: Lists top 30 URLs by total data sent (in bytes).
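
Raw byte counts get hard to read once totals reach the gigabyte range. As a hedged sketch, the same aggregation can be wrapped in a function that prints megabytes instead. The name top_urls_mb is made up for illustration, and the field positions assume the default mod_ssl ssl_request_log format (URL in field 7, bytes in the last field).

```shell
# List the top URLs by total bytes sent, shown in MB.
# Usage: top_urls_mb LOGFILE [COUNT]   (default COUNT: 30)
top_urls_mb() {
  awk '$NF ~ /^[0-9]+$/ { bytes[$7] += $NF }
       END { for (u in bytes) printf "%.0f %s\n", bytes[u], u }' "$1" |
    sort -nr |
    head -n "${2:-30}" |
    awk '{ printf "%10.2f MB  %s\n", $1 / 1048576, $2 }'
}
```

Calling top_urls_mb ssl_request_log 10 prints the ten heaviest URLs. Requests logged with "-" as the size (no body sent) are skipped.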

2. Only Images, JS, CSS, PDFs, ZIPs, Videos

Change the file extensions as needed.

awk '{url=$7; size=$NF} /\.jpg|\.jpeg|\.png|\.gif|\.svg|\.webp|\.ico|\.js|\.css|\.pdf|\.zip|\.mp4|\.mov|\.avi/ {bytes[url]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i], i}' ssl_request_log | sort -nr | head -30
  • What it does: Sums up bandwidth for only assets ending in .jpg, .png, .js, .css, .pdf, .zip, .mp4, .mov, etc.
  • You can add/remove file types by editing the regex.
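
Instead of editing the regex once per file type, it can help to total the bandwidth for every extension in a single pass and see which type dominates. This is a minimal sketch: bytes_by_extension is a hypothetical helper name, and it again assumes the URL is in field 7 and the byte count is the last field.

```shell
# Total bytes sent per file extension, largest first.
# Usage: bytes_by_extension LOGFILE
bytes_by_extension() {
  awk '$NF ~ /^[0-9]+$/ {
         path = $7
         sub(/[?#].*$/, "", path)        # drop any query string or fragment
         n = split(path, parts, "/")
         file = parts[n]                 # last path segment
         ext = "(none)"
         if (match(file, /\.[A-Za-z0-9]+$/))
           ext = tolower(substr(file, RSTART + 1))
         bytes[ext] += $NF
       }
       END { for (e in bytes) printf "%.0f %s\n", bytes[e], e }' "$1" |
    sort -nr
}
```

URLs with no extension, such as / or /api/users, are grouped under (none). Because the extension is taken from the end of the path only, /app.js?v=3 counts as js and a .json file is not mistaken for JavaScript.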

3. Top Consumers by File Extension (e.g., images only)

Images Only

awk '{url=$7; size=$NF} /\.jpg|\.jpeg|\.png|\.gif|\.svg|\.webp|\.ico/ {bytes[url]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i], i}' ssl_request_log | sort -nr | head -20

JavaScript Only

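awk '{url=$7; size=$NF} url ~ /\.js([?#]|$)/ {bytes[url]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i], i}' ssl_request_log | sort -nr | head -20
  • The pattern is anchored to the end of the URL (or the start of a query string) so that .json and .jsp files are not counted as JavaScript.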

PDF Only

awk '{url=$7; size=$NF} /\.pdf/ {bytes[url]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i], i}' ssl_request_log | sort -nr | head -20

Video Only (mp4, mov, avi)

awk '{url=$7; size=$NF} /\.mp4|\.mov|\.avi/ {bytes[url]+=(size~/^[0-9]+$/?size:0)} END {for(i in bytes) print bytes[i], i}' ssl_request_log | sort -nr | head -20

4. Top Consumers by Client IP and Asset

Which IP is requesting which large resource the most:

awk '{ip=$3; url=$7; size=$NF} {combo=ip" "url} (size~/^[0-9]+$/) && (url ~ /\.jpg|\.png|\.js|\.css|\.pdf|\.zip|\.mp4|\.mov|\.avi/) {bytes[combo]+=size} END {for(i in bytes) print bytes[i], i}' ssl_request_log | sort -nr | head -30
  • Shows which client IP and URL combinations account for the most data sent.

5. Top 404/Errors (Optional, helps in asset wastage detection)

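awk '$9 ~ /^404$/ {print $7}' access_log | sort | uniq -c | sort -nr | head -20
  • This needs a log with the status code in field $9, as in the combined format used by access_log. The default mod_ssl ssl_request_log format has no status code; its field $9 is the byte count.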

Tips

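  • Replace ssl_request_log with your actual log file name if different.
  • The field numbers above (client IP in $3, URL in $7, bytes in the last field) match the default mod_ssl ssl_request_log format. In the combined format used by access_log, the client IP is $1 and the byte count is $10, so adjust the field numbers accordingly.
  • For daily/weekly data, filter by date before the awk. Apache timestamps look like 18/Jul/2025, so use grep "18/Jul/2025" or similar.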
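
To compare days without running grep once per date, the totals can be grouped by the date part of the timestamp. A hedged sketch follows: bytes_per_day is an invented name, and it assumes each line starts with an Apache-style timestamp such as [18/Jul/2025:10:00:00 +0000] with the byte count in the last field, as in the default ssl_request_log format.

```shell
# Total bytes sent per calendar day, busiest day first.
# Usage: bytes_per_day LOGFILE
bytes_per_day() {
  awk '$NF ~ /^[0-9]+$/ {
         day = substr($1, 2, 11)    # "[18/Jul/2025:10:00:00" -> "18/Jul/2025"
         bytes[day] += $NF
       }
       END { for (d in bytes) printf "%.0f %s\n", bytes[d], d }' "$1" |
    sort -nr
}
```

The output is sorted by volume rather than by date, which makes a traffic spike easy to spot. For a combined-format access_log the timestamp is in field 4 rather than field 1, so substr($1, 2, 11) would become substr($4, 2, 11) and $NF would become $10.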
