A shell script to automate daily website access log statistics

The following shell script compiles statistics from the previous day's access log and sends them by e-mail, which makes it easy to keep track of how the site is doing each day.
The script reports:
1. Total number of visits (requests)
2. Total bandwidth
3. Number of unique visitors
4. Visits per IP
5. Visits per URL
6. Referrer (source page) statistics
7. 404 statistics
8. Search engine spider visits (Google, Baidu)
9. Visits referred by search engines (Google, Baidu)

  #!/bin/bash
  # Rotated log holding yesterday's requests, the site domain, and the report recipient.
  log_path=/home/www.centos.bz/log/access.log.1
  domain="centos.bz"
  email="log@centos.bz"
  maketime=$(date "+%Y-%m-%d %H:%M")
  logdate=$(date -d "yesterday" +%Y-%m-%d)
  # Total requests, bandwidth in MB (field 10 is the bytes sent), and unique client IPs.
  total_visit=$(wc -l "${log_path}" | awk '{print $1}')
  total_bandwidth=$(awk -v total=0 '{total+=$10}END{print total/1024/1024}' "${log_path}")
  total_unique=$(awk '{ip[$1]++}END{print length(ip)}' "${log_path}")
  # Top 20 client IPs, requested URLs, external referrers, and URLs that returned 404.
  ip_pv=$(awk '{ip[$1]++}END{for (k in ip){print ip[k],k}}' "${log_path}" | sort -rn | head -20)
  url_num=$(awk '{url[$7]++}END{for (k in url){print url[k],k}}' "${log_path}" | sort -rn | head -20)
  referer=$(awk '$11 !~ /http:\/\/[^/]*'"$domain"'/{url[$11]++}END{for (k in url){print url[k],k}}' "${log_path}" | sort -rn | head -20)
  notfound=$(awk '$9 == 404 {url[$7]++}END{for (k in url){print url[k],k}}' "${log_path}" | sort -rn | head -20)
  # With '"' as the separator, field 6 is the user agent and field 4 is the referrer.
  spider=$(awk -F'"' '$6 ~ /Baiduspider/ {spider["baiduspider"]++} $6 ~ /Googlebot/ {spider["googlebot"]++}END{for (k in spider){print k,spider[k]}}' "${log_path}")
  search=$(awk -F'"' '$4 ~ /http:\/\/www\.baidu\.com/ {search["baidu_search"]++} $4 ~ /http:\/\/www\.google\.com/ {search["google_search"]++}END{for (k in search){print k,search[k]}}' "${log_path}")
  # Assemble the report and mail it.
  echo -e "Overview\nReport generation time: ${maketime}\nTotal visits: ${total_visit}\nTotal bandwidth: ${total_bandwidth}M\nUnique visitors: ${total_unique}\n\nIP statistics\n${ip_pv}\n\nURL statistics\n${url_num}\n\nReferrer statistics\n${referer}\n\n404 statistics\n${notfound}\n\nSpider statistics\n${spider}\n\nSearch engine referrer statistics\n${search}" | mail -s "$domain $logdate log statistics" "${email}"
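
The awk field numbers in the script only make sense for a particular log layout. The sketch below assumes the common "combined" access log format (the article does not say which format the log uses) and runs a made-up sample line through awk to show what `$1`, `$7`, `$9`, `$10` and `$11` hold, and what `$4` and `$6` hold when the separator is a double quote. The sample line and the variable names `sample`, `fields` and `quoted` are illustrative only.

```shell
# A made-up log line in the combined format (not taken from a real log).
sample='203.0.113.7 - - [10/Oct/2023:13:55:36 +0800] "GET /index.html HTTP/1.1" 200 2326 "http://www.google.com/search?q=centos" "Mozilla/5.0 (compatible; Googlebot/2.1)"'

# Whitespace-separated fields, as used by the IP, URL, referrer and 404 statistics.
fields=$(printf '%s\n' "$sample" | awk '{print "ip=" $1, "url=" $7, "status=" $9, "bytes=" $10, "referer=" $11}')
echo "$fields"

# Quote-separated fields, as used by the spider and search engine statistics.
quoted=$(printf '%s\n' "$sample" | awk -F'"' '{print "referer=" $4; print "agent=" $6}')
echo "$quoted"
```

If your server writes a custom log format, the field numbers in the script would need to be adjusted in the same way.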

Modify the three variables log_path, domain and email to match your own site.
Then add the script to a scheduled task (cron), and you will receive the statistics by e-mail every day.
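
As a rough sketch of the scheduled task, the snippet below builds a crontab entry and prints it. The script location `/usr/local/bin/log_stats.sh` and the 06:00 run time are assumptions made for illustration; since the script reads the rotated file `access.log.1`, the job should run after your log rotation has finished.

```shell
# Hypothetical location where the statistics script was saved.
script_path=/usr/local/bin/log_stats.sh

# Crontab fields: minute hour day-of-month month day-of-week command.
# 06:00 is an assumed time; pick one after your log rotation has finished.
cron_line="0 6 * * * /bin/bash ${script_path}"
echo "${cron_line}"

# To install it, append the line to the current user's crontab:
# ( crontab -l 2>/dev/null; echo "${cron_line}" ) | crontab -
```

The install command is left commented out so that running the snippet only prints the entry; alternatively, paste the printed line into `crontab -e`.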
