pg_statsinfo的架构如下:
分为三个组件:
1. pg_statsinfo
2. pg_reporter
部署在HTML报告服务器上,需要与repository DB通信,与被监控的数据库通信(可选).
3. repository DB
用于存放pg_statsinfo发送过来的snapshot报告。被pg_reporter调用,生产HTML报告。
另外,非常强的一点是可以自己编写模板。
架构如图:
报告分为两类:
第二类是schema,需要有连接到被监控数据库的配置。
下面是statsinfo报告的介绍:
1. Summary
name | 5480307906522906617 | @H_404_43@
---|---|
hostname | db-172-16-3-33.sky-mobi.com.hz | @H_404_43@
port | 1921 | @H_404_43@
pg_version | 9.0beta2 | @H_404_43@
snapshot begin | 2010-06-08 18:04:52 | @H_404_43@
snapshot end | 2010-06-09 13:30:00 | @H_404_43@
snapshot duration | 19:25:09 | @H_404_43@
total database size | 5073 kB | @H_404_43@
total commits | 18698 | @H_404_43@
total rollbacks | 2 | @H_404_43@
2.Database Statistics
ID | database | MB | +MB | commit/s | rollback/s | hit% | gets/s | reads/s | rows/s | @H_404_43@
---|---|---|---|---|---|---|---|---|---|
1 | postgres | 4 | 0 | 0.267 | 0.000 | 99.900 | 17.772 | 0.016 | 95.099 | @H_404_43@
2 | test | 26 | 26 | 0.047 | 0.000 | 99.800 | 23.219 | 0.043 | 82.867 | @H_404_43@
Disk Usage
Disk Usage per Tablespace
ID | tablespace | location | device | used (MB) | avail (MB) | remain% | @H_404_43@
---|---|---|---|---|---|---|
1 | <pg_xlog> | /database/pgdata/tbs2/pg_xlog | 104:33 | 187 | 137594 | 99.864 | @H_404_43@
2 | pg_default | /database/pgdata/tbs1/pg_root | 104:17 | 74 | 137707 | 99.946 | @H_404_43@
3 | pg_global | /database/pgdata/tbs1/pg_root | 104:17 | 74 | 137707 | 99.946 | @H_404_43@
4 | tbs_test | /database/pgdata/tbs4/tbs_test | 104:65 | 86 | 137695 | 99.937 | @H_404_43@
Long Transactions
ID | pid | client address | when to start | duration (sec) | query | @H_404_43@
---|
Notable Tables
Heavily Updated Tables
ID | database | schema | table | INSERT | UPDATE | DELETE | total | HOT% | @H_404_43@
---|---|---|---|---|---|---|---|---|
1 | test | test | tbl_test | 620075 | 0 | 0 | 620075 | @H_404_43@ |
2 | test | pg_toast | pg_toast_2619 | 12 | 0 | 6 | 18 | @H_404_43@ |
3 | test | pg_catalog | pg_attribute | 7 | 0 | 0 | 7 | @H_404_43@ |
4 | test | pg_catalog | pg_shdepend | 4 | 0 | 0 | 4 | @H_404_43@ |
5 | postgres | pg_catalog | pg_shdepend | 4 | 0 | 0 | 4 | @H_404_43@ |
6 | test | pg_catalog | pg_depend | 3 | 0 | 0 | 3 | @H_404_43@ |
7 | test | pg_catalog | pg_statistic | 1 | 2 | 0 | 3 | 50.000 | @H_404_43@
8 | test | pg_catalog | pg_type | 2 | 0 | 0 | 2 | @H_404_43@ |
9 | test | pg_catalog | pg_namespace | 1 | 0 | 0 | 1 | @H_404_43@ |
10 | postgres | pg_catalog | pg_tablespace | 1 | 0 | 0 | 1 | @H_404_43@ |
11 | postgres | pg_catalog | pg_database | 1 | 0 | 0 | 1 | @H_404_43@ |
12 | test | pg_catalog | pg_authid | 1 | 0 | 0 | 1 | @H_404_43@ |
13 | test | pg_catalog | pg_database | 1 | 0 | 0 | 1 | @H_404_43@ |
14 | postgres | pg_catalog | pg_authid | 1 | 0 | 0 | 1 | @H_404_43@ |
15 | test | pg_catalog | pg_tablespace | 1 | 0 | 0 | 1 | @H_404_43@ |
16 | test | pg_catalog | pg_class | 1 | 0 | 0 | 1 | @H_404_43@ |
17 | postgres | pg_catalog | pg_shdescription | 0 | 0 | 0 | 0 | @H_404_43@ |
18 | postgres | pg_catalog | pg_foreign_data_wrapper | 0 | 0 | 0 | 0 | @H_404_43@ |
19 | postgres | pg_catalog | pg_proc | 0 | 0 | 0 | 0 | @H_404_43@ |
20 | postgres | pg_catalog | pg_user_mapping | 0 | 0 | 0 | 0 | @H_404_43@
Heavily Accessed Tables
ID | database | schema | table | seq_scan | seq_tup_read | tup_per_seq | hit% | @H_404_43@
---|
Low Density Tables
ID | database | schema | table | rows | dead rows | pages | rows per page | @H_404_43@
---|---|---|---|---|---|---|---|
1 | test | test | tbl_test | 620075 | 0 | 2744 | 225.975 | @H_404_43@
ID | database | schema | table | column | correlation | @H_404_43@
---|
Checkpoint Activity
total checkpoints | 66 | @H_404_43@
---|---|
checkpoints by time | 64 | @H_404_43@
checkpoints by xlog | 0 | @H_404_43@
avg written buffers | 42.500 | @H_404_43@
max written buffers | 2336.000 | @H_404_43@
avg duration (sec) | 3.721 | @H_404_43@
max duration (sec) | 149.940 | @H_404_43@
Autovacuum Activity
ID | database | schema | table | count | avg index scans | avg removed rows | avg remain rows | avg duration (sec) | max duration (sec) | @H_404_43@
---|
Query Activity
Functions
ID | funcid | name | name | funcname | calls | total time (ms) | self time (ms) | time/call (ms) | @H_404_43@
---|
Statements
ID | user | database | query | calls | total time (sec) | time/call (sec) | @H_404_43@
---|---|---|---|---|---|---|
10 | postgres | postgres | SELECT statsinfo.sample() | 14034 | 0.316 | 0.000 | @H_404_43@
16 | postgres | postgres | SELECT * FROM statsinfo.tablespaces | 283 | 0.082 | 0.000 | @H_404_43@
19 | postgres | postgres | SELECT * FROM statsinfo.activity() | 283 | 0.019 | 0.000 | @H_404_43@
1 | postgres | postgres | SELECT d.oid AS dbid,d.datname,pg_database_size(d.oid),age(d.datfrozenxid),pg_stat_get_db_xact_commit(d.oid) AS xact_commit,pg_stat_get_db_xact_rollback(d.oid) AS xact_rollback,pg_stat_get_db_blocks_fetched(d.oid) - pg_stat_get_db_blocks_hit(d.oid) AS blks_read,pg_stat_get_db_blocks_hit(d.oid) AS blks_hit,pg_stat_get_db_tuples_returned(d.oid) AS tup_returned,pg_stat_get_db_tuples_fetched(d.oid) AS tup_fetched,pg_stat_get_db_tuples_inserted(d.oid) AS tup_inserted,pg_stat_get_db_tuples_updated(d.oid) AS tup_updated,pg_stat_get_db_tuples_deleted(d.oid) AS tup_deleted FROM pg_database d WHERE datallowconn AND datname <> ALL (('{' || $1 || '}')::text[]) ORDER BY 1 | 254 | 2.688 | 0.011 | @H_404_43@
Setting Parameters
ID | name | setting | source | @H_404_43@
---|---|---|---|
1 | TimeZone | PRC | command line | @H_404_43@
2 | checkpoint_segments | 32 | configuration file | @H_404_43@
3 | default_statistics_target | 1000 | configuration file | @H_404_43@
4 | default_text_search_config | pg_catalog.english | configuration file | @H_404_43@
5 | effective_cache_size | 1024000 | configuration file | @H_404_43@
6 | lc_messages | C | configuration file | @H_404_43@
7 | listen_addresses | * | configuration file | @H_404_43@
8 | log_autovacuum_min_duration | 60000 | configuration file | @H_404_43@
9 | log_checkpoints | on | configuration file | @H_404_43@
10 | log_destination | csvlog | override | @H_404_43@
11 | log_directory | /var/applog/pg_log | configuration file | @H_404_43@
12 | log_lock_waits | on | configuration file | @H_404_43@
13 | log_statement | ddl | configuration file | @H_404_43@
14 | log_timezone | PRC | command line | @H_404_43@
15 | log_truncate_on_rotation | on | configuration file | @H_404_43@
16 | logging_collector | on | override | @H_404_43@
17 | max_connections | 1500 | configuration file | @H_404_43@
18 | max_stack_depth | 8192 | configuration file | @H_404_43@
19 | pg_statsinfo.excluded_dbnames | template0,template1 | configuration file | @H_404_43@
20 | pg_statsinfo.repository_server | hostaddr=172.16.3.39 port=1921 database=repo user=statsrepo → dbname=repo host=172.16.3.39 port=1921 user=statsrepo | configuration file | @H_404_43@
21 | random_page_cost | 2 | configuration file | @H_404_43@
22 | server_encoding | UTF8 | override | @H_404_43@
23 | shared_buffers | 192000 | configuration file | @H_404_43@
24 | shared_preload_libraries | pg_statsinfo,pg_stat_statements | configuration file | @H_404_43@
25 | stats_temp_directory | /database/pgdata/tbs3/pg_stat_tmp | configuration file | @H_404_43@
26 | superuser_reserved_connections | 13 | configuration file | @H_404_43@
27 | timezone_abbreviations | Default | command line | @H_404_43@
28 | track_functions | pl | configuration file | @H_404_43@
29 | wal_buffers | 256 | configuration file | @H_404_43@
30 | wal_sync_method | open_sync | configuration file | @H_404_43@
Schema Information
Tables
ID | database | schema | table | columns | row width | MB | +MB | table scans | index scans | @H_404_43@
---|---|---|---|---|---|---|---|---|---|
1 | test | test | tbl_test | 1 | 4 | 21 | 21 | 0 | 0 | @H_404_43@
Indexes
ID | database | schema | index | table | MB | +MB | scans | rows/scan | reads | hits | keys | @H_404_43@
---|
配置非常简单,下面简单的介绍一下配置时的注意事项:
安装需求:
Postgresql 版本
Postgresql 8.3,8.4,9.0
操作系统
RHEL 5.3,CentOS 5.3,Windows XP
连接消耗
1 每个被监控的机器需要消耗1个repo DB连接.
限制:
1. 被监控系统的encoding and lc_messages必须相同
2. 被监控系统的pg_statsrepo.textlog_filename名字必须固定,建议所有监控系统一致.
3. log_timezone 参数必须设置为 unknown,gmt,or utc
4. 错误日志记录
5. 不能采集到shutdown的checkpoint
如果repoDB与被监控的数据库是同一个集群,可能采集不到.
被监控数据库维护
1.-- 删除服务端日志pg_log
2.-- 手工生成snapshot
psql -d postgres -U postgres -c "SELECT statsinfo.snapshot('comment')"
3.-- 回旋日志文件
psql -d postgres -U postgres -c "SELECT pg_rotate_logfile()"
4.-- 重启异常进程(会造成僵死进程)
psql -d postgres -U postgres -c "SELECT statsinfo.restart()"
repo数据库维护
1. Delete Snapshots
psql -d <repository> -c "SELECT statsrepo.del_snapshot('2010-02-01 07:00:00');"