
nginx log queries: top 10



Querying nginx logs under heavy traffic and high concurrency

When concurrency spikes, first find the top 10 IPs by request count in the time window in question, then look at what those IPs were actually requesting.

(1) Top 10 IPs by request count

sed -n '/17:45/,/18:00/p' access_2018-06-30.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -10
    909 112.14.31.121
    472 123.139.16.179
    423 23.91.100.204
    416 121.30.192.10
    302 101.69.132.78
    281 101.69.132.82
    281 101.69.132.79
    274 101.69.132.81
    269 101.69.132.77
    252 101.69.132.76

(2) See what those IPs accessed

grep -F "123.139.16.179" access_2018-06-30.log | sort | uniq -c | sort -rn | more

Assorted other nginx log queries

The log format looks like this:
116.211.124.29 - - [08/Feb/2018:04:58:45 +0800] video.yingyou360.cn "GET /do_not_delete/noc.gif HTTP/1.1" 200 3166 "-" "ChinaCache" "118.118.215.18"

(1) IPs with more than 1000 requests
awk '{print $1}' access.log | sort | uniq -c | awk '$1 > 1000' | sort -rn

(2) Number of distinct IPs for the day
awk '{print $1}' /alidata/nginx/logs/access.log | sort -n | uniq | wc -l

(3) Detailed requests from one IP, ranked by frequency (in a script, the IP could be passed in as $1)
grep '123.12.11.20' /alidata/nginx/logs/access.log |awk '{print $8}'|sort |uniq -c |sort -rn |head -n 100
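
The parameterized version hinted at above might look like the following; a minimal sketch (the script name top_urls.sh is hypothetical, the log path is the one used elsewhere in this note):

#!/bin/bash
# Usage: sh top_urls.sh 123.12.11.20
# Prints the 100 most-requested URLs ($8) for the IP passed as $1.
awk -v ip="$1" '$1 == ip {print $8}' /alidata/nginx/logs/access.log | sort | uniq -c | sort -rn | head -n 100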

(4) Most frequently requested pages (top 100)
awk '{print $8}' access.log | sort |uniq -c | sort -rn | head -n 100

(5) Per-minute request counts for a given site: the busiest time points, to the minute (the example below keeps the top 10)
awk  '{print $4,$6}' access_2018-09-30.log |cut -c 1-18,22-100|grep "img1.yingyou360.cn"|sort -nr|uniq -c|sort -nr|head -10
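
The cut -c 1-18,22-100 above truncates the timestamp to the minute by character position, which is fragile if the format ever shifts. An arguably clearer equivalent using awk's substr, assuming the same field layout (timestamp in $4, host in $6; since the host is already filtered, only the minute needs printing):

awk '$6 == "img1.yingyou360.cn" {print substr($4, 1, 18)}' access_2018-09-30.log | sort | uniq -c | sort -rn | head -10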

(6) Per-second request counts for a given site, top 100 (put the timestamp in the first column, then count how many hits the site got in the same second)
awk '{print $4,$6}' access.log |grep "img1.ying360.cn"|sort|uniq -c|sort -nr|head -n 100

(7) Unique-IP count for a time window on a given day, here hours 04-05
grep "07/Apr/2017:0[4-5]" access.log | awk '{print $1}' | sort | uniq -c| sort -nr | wc -l

(8) Per-second requests from a given IP to a given site (to approximate the average concurrency, sum the top-10 counts and divide by 10)
cat access_2018-09-30.log|awk '{print $4,$1,$6}'|grep "112.49.26.41"|grep "img1.yingyou360.cn"|sort -nr|uniq -c|sort -nr|head -10
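
Rather than summing the top-10 counts by hand, the average can be computed in one pass; a sketch under the same field-layout assumption ($4 timestamp to the second, $6 host), which averages over every second the IP was active instead of just the busiest ten:

awk '$1 == "112.49.26.41" && $6 == "img1.yingyou360.cn" {print $4}' access_2018-09-30.log | sort | uniq -c | awk '{sum += $1; n++} END {if (n) print "avg req/sec:", sum / n}'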

(9) Log entries with the largest transfers in a given time window

116.211.124.29 - - [08/Feb/2018:04:58:45 +0800] video.yingyou360.cn "GET /do_not_delete/noc.gif HTTP/1.1" 200 3166 "-" "ChinaCache" "118.118.215.18"
122.228.115.136 - - [08/Feb/2018:08:42:19 +0800] mobile.yingyou360.cn "GET /blr2/blr2_6.0.7.apk?__=1518050443.777 HTTP/1.1" 200 529909274 "-" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0; NetworkBench/8.0.1.323-6674224-2856942) like Gecko" "1.83.205.162, 60.165.55.10"

# Put the size column first and sort on it ($11 is the response size in bytes; dividing by 1024 twice gives MB). The reasoning here is by transferred size: the larger the file, the more bandwidth it consumes.
# sed -n '/8:40/,/8:45/p' access.log |awk '{print $11,$4,$8,$10,$1,$NF}'|sort -nr|uniq -c|more 

More assorted nginx log queries

[root@jxq-c2-16-1 logs]# cat log.sh 
#!/bin/bash
# Total IPs for a given time window
cat /alidata/nginx/logs/access.log |sed -n '/11\/May\/2017:08:32:00/,/11\/May\/2017:08:35:00/p'| awk '{print $1}' | sort | uniq -c | sort -rn | head
# Check whether any .apk requests hit the origin during a time window
#sed -n '/10\/Nov\/2017:17:44:*/,/10\/Nov\/2017:17:58:*/p' access.log|grep -i  "apk" |head -1
sed -n '/17:35/,/18:00/p' access.log|grep -i  "apk" |awk '{print $1,$4,$7,$8,$10,$11,$16}'|more
# Log entries with the largest transfers in a time window
sed -n '/8:40/,/8:45/p' access.log |awk '{print $11,$4,$8,$10,$1,$NF}'|sort -nr|uniq -c|more
# Per-minute concurrent IP count
cat access_2017-07-26.log |egrep  -v "GET|ChinaCache"|sed -n '/26\/Jul\/2017:16:01:00/,/26\/Jul\/2017:16:59:00/p'|grep "api.mobile.playyx.com" |awk -F "[:| ]"  '{print $1,$4}'|sort |uniq -c | wc -l
# Total traffic for a day
# cat access_2018-10-30.log |awk '{sum=sum+$11} END{print sum/1024^3}'
16.3368
$11 is the traffic column in bytes (B); dividing by three factors of 1024 gives 16.3368 GB
# The day's traffic peaks (largest single responses)
[root@localhost ~]# cat access_2018-10-30.log |awk '{print $11}'|sort -nr|head
683243757
683243757
568808672
370556180
244179717
31216982
27669127
25782956
24267351
21754202
[root@localhost ~]# cat access_2018-10-30.log |awk '{print $11}'|sort -nr|head > ff.txt
[root@localhost ~]# cat access_2018-10-30.log |awk '{print $11/1024^2}'|sort -nr|head
651.592
651.592
542.458
353.39    # dropping the top three values, the peak is 353.39 MB
232.868
29.7708
26.3873
24.5885
23.1431
20.7464
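
If the first few values are one-off outliers (e.g. a couple of very large apk downloads), the peak after dropping the top N can be read off directly; a sketch with N=3, so the fourth-largest value is printed:

awk '{print $11/1024^2}' access_2018-10-30.log | sort -nr | sed -n '4p'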
# Total traffic consumed by each of the top 10 IPs by request count
(1) First, list the top 10 IPs by request count and write them to a file
[root@localhost ~]# cat access_2018-10-30.log |awk '{print $1}'|sort -nr|uniq -c |sort -nr|head|awk '{print $2}' > ip2.txt
[root@localhost ~]# cat ip2.txt 
23.91.100.204
101.69.132.78
101.69.132.77
101.69.132.82
101.69.132.76
101.69.132.83
101.69.132.80
101.69.132.84
101.69.132.81
101.69.132.79
(2) Then, for each of those top 10 IPs, pick out its IP and traffic columns and sum the traffic, giving per-IP usage in GB.
[root@localhost ~]# cat ip2.sh   
#!/bin/bash
# For each top-10 IP, sum the bytes column ($11) and report GB.
for i in `cat ip2.txt`
do
    # Match the IP field exactly: a bare grep "$i" treats the dots as
    # regex wildcards and can also hit IPs that merely contain $i.
    awk -v ip="$i" '$1 == ip {sum += $11} END {print ip, sum/1024^3}' access_2018-10-30.log
done
[root@localhost ~]# sh ip2.sh
23.91.100.204 1.20969
101.69.132.78 0.0809936
101.69.132.77 0.0759132
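
Looping over ip2.txt rescans the whole log once per IP. A single-pass alternative totals every IP at once and then lists the heaviest; note this sketch ranks by traffic rather than by request count:

awk '{sum[$1] += $11} END {for (ip in sum) print ip, sum[ip]/1024^3}' access_2018-10-30.log | sort -k2 -rn | head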