sysdig_usage

What is sysdig

Sysdig is an open source tool for system-level exploration: it captures system state and activity from a running Linux instance, then lets you save, filter and analyze them.
Sysdig is scriptable in Lua and includes a command line interface and a powerful interactive UI, csysdig, that runs in your terminal. Think of sysdig as strace + tcpdump + htop + iftop + lsof + awesome sauce, with state of the art container visibility on top.

sysdig features


  • open source
  • system-level
  • strace + tcpdump + htop + iftop + lsof + lua
  • container support
  • offline analysis support

Output format

  • default output (*%evt.num %evt.time %evt.cpu %proc.name (%thread.tid) %evt.dir %evt.type %evt.info):

    21726 10:12:36.851569980 4 hhvm (16343) > stat

  • custom output, set with -p (*%evt.num %evt.time %evt.cpu %proc.name %proc.pid %evt.type %evt.dir %evt.info):

    36330 10:14:04.974795104 4 hhvm 16176 lstat < res=0 path=/home
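
Because the default format is a fixed sequence of whitespace-separated fields, a line can be split into named parts for further processing. The following is only a minimal sketch: parse_default_line is a hypothetical helper (not part of sysdig), and it assumes that process names contain no spaces.

```shell
# Hypothetical helper: split default-format sysdig lines (read from stdin)
# into name=value pairs. Assumes the process name has no spaces in it.
parse_default_line() {
    awk '{
        tid = $5
        gsub(/[()]/, "", tid)              # "(16343)" -> "16343"
        info = ""
        for (i = 8; i <= NF; i++)          # everything after evt.type is evt.info
            info = info (i > 8 ? " " : "") $i
        printf "num=%s time=%s cpu=%s proc=%s tid=%s dir=%s type=%s info=%s\n",
               $1, $2, $3, $4, tid, $6, $7, info
    }'
}

# Demo on the sample line shown above
printf '%s\n' '21726 10:12:36.851569980 4 hhvm (16343) > stat' | parse_default_line
# prints: num=21726 time=10:12:36.851569980 cpu=4 proc=hhvm tid=16343 dir=> type=stat info=
```

On a live system the same function could be fed directly, for example with sysdig | parse_default_line.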
    

Trace Files


  • sysdig -w: write captured events to a trace file
  • sysdig -r: read events from a trace file
  • sysdig -s: number of bytes to capture from each I/O buffer
  • sysdig -C: start a new file once the current one reaches the given size, in MB
  • sysdig -e: start a new file after the given number of events
  • sysdig -G: start a new file after the given number of seconds
  • sysdig -W: limit the number of files; once the cap is reached, older files are overwritten
  • sysdig -z: enable compression for trace files
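
The rotation flags are usually combined. As a rough sketch, the hypothetical helper below (rotating_capture_cmd is not part of sysdig) only prints a capture command built from the flags listed above, so it can be reviewed before being run with root privileges.

```shell
# Hypothetical helper: print (do not run) a compressed, rotating capture command.
#   $1 = trace file name
#   $2 = maximum size of each file, in MB   (-C)
#   $3 = number of files to keep            (-W)
rotating_capture_cmd() {
    printf 'sysdig -z -C %s -W %s -w %s\n' "$2" "$3" "$1"
}

rotating_capture_cmd trace.scap 100 5
# prints: sysdig -z -C 100 -W 5 -w trace.scap
```

The printed command keeps at most 5 files of roughly 100 MB each, which bounds disk usage during a long capture.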

Chisels


Directories

  • global chisels directory: /usr/share/sysdig/chisels
  • personal chisels directory: ~/.chisels

Options

  • sysdig -cl: list the available chisels
  • sysdig -i bottlenecks: show information about a chisel
  • sysdig -c spy_port 80: run the spy_port chisel
  • sysdig -c spy_host 1.1.1.1 proc.name=nginx: run a chisel with a filter

Filters


sysdig -l: list the fields that can be used in filters

Command-line options


  • -A, --print-ascii: only print the text portion of data buffers
  • -b, --print-base64: print data buffers in base64
  • -D, --debug: debug mode
  • -E, --exclude-users: do not build the user/group tables when sysdig starts
  • -F, --fatfile: enable fatfile mode; the output file will contain events that will be invisible when reading the file
  • -j, --json: emit output as JSON
  • -i chiselname, --chisel-info=chiselname: get a longer description and the arguments associated with a chisel
  • -L, --list-events: list the events that the engine supports
  • -l, --list: list the fields that can be used for filtering and output formatting. Use -lv to get additional information for each field.
  • -n num, --numevents=num: stop capturing after num events
  • -p outputformat, --print=outputformat: specify the format to be used when printing the events
  • -r readfile, --read=readfile: read the events from readfile
  • -S, --summary: print the event summary
  • -t timetype, --timetype=timetype: choose how event times are displayed
    • h: human-readable string
    • a: absolute timestamp from epoch
    • r: relative time from the beginning of the capture
    • d: delta between event enter and exit
    • D: delta from the previous event
  • -v, --verbose: verbose output
  • -w writefile, --write=writefile: write the captured events to writefile

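Hands-on


Basic usage

  • Show the processes that access a given file
    • sysdig fd.name=/etc/resolv.conf
  • Show events from processes whose name contains agent
    • sysdig proc.name contains agent
  • Monitor a user's activity
    • sysdig -c spy_users "user.name=work"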

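Performance analysis

  • Show file operations that take longer than 100 ms
    • sysdig -c fileslower 100
  • Show network requests that take longer than 1 s
    • sysdig -c netlower 1000
  • Show the execution time of nginx processes
    • sysdig -c proc_exec_time proc.name=nginx
  • Show hhvm system calls that take longer than 1 s
    • sysdig -c scallslower 1000 proc.name=hhvm
  • Show the most time-consuming system calls of hhvm
    • sysdig -c topscalls_time proc.name=hhvm
  • Show the processes using the most CPU on CPU 0
    • sysdig -c topprocs_cpu evt.cpu=0
  • Show the ports with the most traffic
    • sysdig -c topports_server
  • Show the processes with the most network traffic
    • sysdig -c topprocs_net
  • Show the files with the most I/O
    • sysdig -c topfiles_bytes
  • Show the processes with the most file I/O
    • sysdig -c topprocs_file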
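
These chisels can also be run against a saved trace instead of the live system. The sketch below is one possible way to batch them: top_report_cmds is a hypothetical helper (not part of sysdig) that prints one sysdig -r ... -c ... command per "top" chisel listed above, assuming the trace file was written earlier with sysdig -w.

```shell
# Hypothetical helper: print one offline-analysis command per "top" chisel.
#   $1 = trace file previously written with sysdig -w
top_report_cmds() {
    for chisel in topprocs_cpu topprocs_net topprocs_file topfiles_bytes topports_server; do
        printf 'sysdig -r %s -c %s\n' "$1" "$chisel"
    done
}

top_report_cmds trace.scap
```

Piping the output to sh (top_report_cmds trace.scap | sh) would run the reports one after another on a machine where sysdig and the trace file are available.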

HTTP capture

  • Show all HTTP requests
    • sysdig -c httplog
  • Show the top HTTP URLs
    • sysdig -c httptop
  • Show POST requests on port 8080
    • sysdig -A -c echo_fds fd.port=8080 and evt.buffer contains POST
  • Show the connections accepted by nginx
    • sysdig proc.name=nginx and evt.type=accept

Custom chisels


Reference

Purpose

  • Count the QPS (queries per second) of HTTP traffic

Code

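-- Chisel description
description = "Show QPS for all HTTP requests";
short_description = "HTTP QPS count";
category = "Net";
args = {}

require "http"

-- Initialization callback
function on_init()
    -- http_init() comes from the http chisel library; it is expected to set
    -- the capture filter and request the event fields used below, including
    -- the global datetime_field.
    http_init()
    return true
end

-- Request counters keyed by "YYYY-MM-DD HH:MM:SS"
qps_table = {}
-- Second currently being counted; nil until the first transaction arrives
time_tmp = nil
-- Second of the event being processed; set in on_event()
time_now = nil

-- Split s on any character contained in p and return the pieces as a list
function string_split(s, p)
    local rt = {}
    string.gsub(s, '[^' .. p .. ']+', function(w) table.insert(rt, w) end)
    return rt
end

-- Print and discard the counter of the second currently being counted
function flush_current()
    if time_tmp ~= nil and qps_table[time_tmp] ~= nil then
        print(string.format("%s %10d", time_tmp, qps_table[time_tmp]))
        qps_table[time_tmp] = nil
    end
end

-- Called by the HTTP parser for each parsed transaction
function on_transaction(transaction)
    if time_now ~= time_tmp then
        -- A new second has started: report the previous one
        flush_current()
        time_tmp = time_now
    end
    qps_table[time_tmp] = (qps_table[time_tmp] or 0) + 1
end

-- Event callback
function on_event()
    -- Keep only the part of the timestamp before the fractional seconds
    time_now = string_split(evt.field(datetime_field), ".")[1]
    run_http_parser(evt, on_transaction)
end

-- End-of-capture callback: report the last second, which is still open
function on_capture_end()
    flush_current()
end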
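
The counting logic of the chisel (truncate each timestamp to whole seconds, count until the second changes, then print) can be tried without sysdig. The sketch below is an illustration only: count_per_second is a hypothetical helper, and the input format (date in the first field, time with fractional seconds in the second) is an assumption, not the output of any particular chisel.

```shell
# Hypothetical helper: count input lines per second.
# Assumes each line starts with "YYYY-MM-DD HH:MM:SS.fraction" and that
# lines arrive in chronological order.
count_per_second() {
    awk '{
        sub(/\..*$/, "", $2)               # drop the fractional seconds
        key = $1 " " $2
        if (key != cur) {                  # a new second has started
            if (cur != "") printf "%s %d\n", cur, n
            cur = key
            n = 0
        }
        n++
    }
    END { if (cur != "") printf "%s %d\n", cur, n }'
}

printf '%s\n' \
    '2024-01-01 10:00:00.100 GET /a' \
    '2024-01-01 10:00:00.900 GET /b' \
    '2024-01-01 10:00:01.200 GET /c' | count_per_second
# prints:
# 2024-01-01 10:00:00 2
# 2024-01-01 10:00:01 1
```

As in the chisel, seconds without any request produce no output line.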

Demo

sysdig http_qps

Q&A