== Postgres cli commands ==
'''Wide output''' ( a la MySQL's \G )

Good for seeing examples of a FEW records:
 \x
 select * from users limit 1;
It's a toggle, so run it again on its own and it gets turned off.

list all databases:
 \l
 \list
databases by size:
 \l+
 \l+ PGDATABASE
list all tables in the current database:
 \dt
list relations ( tables and sequences ):
 \d
change db after connect with:
 \c databasename
== An App user ==
Adding a secure, least-privileges user for your app, no root.
 GRANT USAGE ON SCHEMA public TO myuser; -- more schemas?
 GRANT SELECT ON ALL TABLES IN SCHEMA public TO myuser;
This ensures that all subsequent tables are accessible by that user ( role ):
 ALTER DEFAULT PRIVILEGES FOR ROLE mycreating_user IN SCHEMA public GRANT SELECT ON TABLES TO myuser;
ref: https://dba.stackexchange.com/questions/91953/grant-access-to-all-tables-of-a-database
full treatment: https://stackoverflow.com/questions/41537825/restricted-postgresql-permissions-for-web-app/47955583#47955583
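Putting the full treatment together, a minimal sketch of a least-privilege app user ( the role names app_user and owner_role are hypothetical ):
<pre>
-- sketch: least-privilege app user; role names are hypothetical
CREATE ROLE app_user LOGIN PASSWORD 'secret';
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO app_user;
-- keep tables created later by the owning role accessible
ALTER DEFAULT PRIVILEGES FOR ROLE owner_role IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_user;
ALTER DEFAULT PRIVILEGES FOR ROLE owner_role IN SCHEMA public
    GRANT USAGE, SELECT ON SEQUENCES TO app_user;
</pre>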
=== User Access Permissions Report ===
To get a view of user permissions across the whole enchilada for user "myusername":
<pre>
SELECT
    u.usename,
    s.tblname,
    has_table_privilege(u.usename,s.tblname,'SELECT') AS user_has_select_permission,
    has_table_privilege(u.usename,s.tblname,'INSERT') AS user_has_insert_permission,
    has_table_privilege(u.usename,s.tblname,'UPDATE') AS user_has_update_permission,
    has_table_privilege(u.usename,s.tblname,'DELETE') AS user_has_delete_permission,
    has_table_privilege(u.usename,s.tblname,'REFERENCES') AS user_has_references_permission
FROM
    pg_user u
CROSS JOIN
    (SELECT DISTINCT schemaname || '.' || tablename AS tblname FROM pg_tables WHERE schemaname NOT IN ('pg_catalog', 'public')) s
WHERE u.usename = 'myusername'
--and user_has_select_permission = false
ORDER BY 3, 2;
</pre>
( note: Postgres concatenates strings with ||, not + )
=== READ only user ===
 -- Create a group
 CREATE ROLE readaccess;
 -- Grant access to existing tables
 GRANT USAGE ON SCHEMA public TO readaccess;
 GRANT SELECT ON ALL TABLES IN SCHEMA public TO readaccess;
 -- Grant access to future tables
 ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readaccess;
 -- Create a final user with password
 CREATE USER tomek WITH PASSWORD 'secret';
 GRANT readaccess TO tomek;
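To sanity-check the grants, a quick sketch using SET ROLE ( run it as tomek or a superuser; the table name is hypothetical ):
<pre>
SET ROLE readaccess;
SELECT * FROM some_table LIMIT 1;        -- should succeed
INSERT INTO some_table DEFAULT VALUES;   -- should fail: permission denied
RESET ROLE;
</pre>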
=== Setting Default permissions for a read only user ===
For subsequent tables with no schema scope defined:
 ALTER DEFAULT PRIVILEGES GRANT SELECT ON TABLES TO myusername;
Changes current tables and not subsequent:
 GRANT SELECT ON ALL TABLES IN SCHEMA myschema TO myusername;
=== Another read only user ===
 postgres=> \du+ pg_monitor
<pre>
postgres=> create role user_monitor password 'XXX' login;
CREATE ROLE
postgres=> grant pg_monitor to user_monitor ;
GRANT ROLE
postgres=> \du+ user_monitor
                      List of roles
  Role name    | Attributes |  Member of   | Description
---------------+------------+--------------+-------------
 user_monitor  |            | {pg_monitor} |
</pre>
== Restore and backup ==
as postgres user

restore:
 psql dbname < infile
backup:
 /bin/pg_dump dbname > /tmp/dbname.backup.date
Bigger script:
<pre>
#!/bin/sh -x
DBNAME="dbname"
DIR=/data/database_backups
NOW=`date "+%Y%m%d%H%M%S"`

# binaries
# stand alone postgres
PGDUMP="/bin/pg_dump"
PSQL="/bin/psql"
PGCONF="/var/lib/pgsql/data/postgresql.conf"

echo pg_dump
${PGDUMP} ${DBNAME} | /bin/gzip > ${DIR}/vmdb.pg_dump.${NOW}.gz

echo postgresql.conf
/bin/cp ${PGCONF} ${DIR}/postgresql.${NOW}.conf

echo server vars
echo "SHOW ALL" | ${PSQL} > ${DIR}/psql_SHOW_ALL_${NOW}.out
</pre>
UPDATE 2019 style!
 pg_dump -h host -p 5432 -U user -Fd -j 16 -b -v -f "/tmp/filename" db_name
<pre>
pg_dump
 -h host   # host
 -p 5432   # port
 -U user   # user
 -Fd       # output format = directory ( fit for the pg_restore command; required for -j )
 -j 16     # dump 16 tables in parallel
 -b        # include large objects
 -v        # be verbose
 -f "/tmp/db_name_dir" db_name
</pre>
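The matching restore for a directory-format dump, a sketch assuming the dump above and an existing ( empty ) target database:
 pg_restore -h host -p 5432 -U user -d db_name -j 16 -v "/tmp/db_name_dir"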
== RDS Gotchas ==
Grant the master user membership in the role you want to grant to others ( RDS only ). The RDS master user is not a real superuser, so it must be a member of a role before it can grant that role or create objects owned by it:

https://stackoverflow.com/questions/26684643/error-must-be-member-of-role-when-creating-schema-in-postgresql
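For example, a sketch run as the RDS master user ( the role and schema names are hypothetical ):
 GRANT app_owner TO rds_master_user;                -- make the master user a member first
 CREATE SCHEMA app_schema AUTHORIZATION app_owner;  -- now this no longer errors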
'''Adding GIS support'''

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html#Appendix.PostgreSQL.CommonDBATasks.PostGIS
== pg_stat_statements based reports ==
https://gist.github.com/NikolayS/b6ec676ea63ab8d4db680a4c0d88e8bf
== pg_stat_statements setup and queries ==
Install the pg_stat_statements extension:
 tenant=> create extension pg_stat_statements;
 CREATE EXTENSION
then query:
 SELECT calls, total_time, total_time::decimal/calls AS avg, query FROM pg_stat_statements
 ORDER BY total_time DESC LIMIT 10;
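On Postgres 13 and later, total_time was split into total_exec_time and total_plan_time, so the same report becomes:
 SELECT calls, total_exec_time, total_exec_time::decimal/calls AS avg, query FROM pg_stat_statements
 ORDER BY total_exec_time DESC LIMIT 10;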
== per table cache hit ratios ==
<pre>
SELECT
  relname as table ,
  heap_blks_read as heap_read,
  heap_blks_hit as heap_hit,
  to_char( ( heap_blks_hit::decimal / ( heap_blks_hit::decimal + heap_blks_read::decimal ) ) * 100 , '99.99' ) as ratio
FROM
  pg_statio_user_tables
WHERE
  heap_blks_hit + heap_blks_read > 0
  AND relid > 10000
ORDER BY relname;
</pre>
don't forget to select pg_stat_reset();
The same query, ordered by ratio:
<pre>
SELECT
  relname as table ,
  heap_blks_read as heap_read,
  heap_blks_hit as heap_hit,
  to_char( ( heap_blks_hit::decimal / ( heap_blks_hit::decimal + heap_blks_read::decimal ) ) * 100 , '99.99' ) as ratio
FROM
  pg_statio_user_tables
WHERE
  heap_blks_hit + heap_blks_read > 0
  AND relid > 10000
ORDER BY ratio;
</pre>
== percent of time index used ==
<pre>
SELECT relname,
   100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,
   n_live_tup rows_in_table
FROM pg_stat_user_tables
WHERE seq_scan + idx_scan > 0
ORDER BY percent_of_times_index_used ASC;
</pre>
== Process list stuff ==
connections per client:
 SELECT client_addr , count(1) from pg_stat_activity group by client_addr;
postgres 9.5 and earlier ( the waiting column was replaced by wait_event_type/wait_event in 9.6 ):
 SELECT pid, datname, usename, query_start, state_change, waiting , state
 from pg_stat_activity ;
 SELECT min(query_start) as oldest, max(query_start) as newest, state, count(1)
 from pg_stat_activity
 GROUP BY state ;
=== postgres 11 ===
 SELECT pid, usename, client_addr , query_start, backend_start, state_change, wait_event_type, wait_event, state, backend_type , substring( query , 0, 20 ) as query
 from pg_stat_activity ;
prometheus:
 SELECT pid, usename, client_addr , query_start, backend_start, state_change, wait_event_type, wait_event, state, backend_type , query
 from pg_stat_activity
 where usename = 'prometheus';
background writer stats:
 select * from pg_stat_bgwriter;
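To stop a runaway backend found with the queries above ( the pid here is a placeholder ):
 SELECT pg_cancel_backend(12345);     -- cancel the current query only
 SELECT pg_terminate_backend(12345);  -- drop the whole connection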
== Vacuum ==
<pre>
with base as (
  SELECT round(n_dead_tup / (n_live_tup * current_setting('autovacuum_vacuum_scale_factor')::float8 + current_setting('autovacuum_vacuum_threshold')::float8)::numeric, 3) fragmentation
       , schemaname
       , relname
       , greatest(last_vacuum, last_autovacuum) AT TIME ZONE 'EDT' last_vacuum
       , case when coalesce(last_vacuum,'1970-01-01') > coalesce(last_autovacuum,'1970-01-01') then 'Job' when coalesce(last_vacuum,'1970-01-01') < coalesce(last_autovacuum,'1970-01-01') then 'Auto' else null end vacuum_type
       , greatest(last_analyze, last_autoanalyze) AT TIME ZONE 'EDT' last_analyze
       , case when coalesce(last_analyze,'1970-01-01') > coalesce(last_autoanalyze,'1970-01-01') then 'Job' when coalesce(last_analyze,'1970-01-01') < coalesce(last_autoanalyze,'1970-01-01') then 'Auto' else null end analyze_type
       , n_live_tup live_tuple
       , n_dead_tup dead_tuple
       , round(n_dead_tup / greatest(n_live_tup,1)::numeric,4) ratio
  FROM pg_stat_all_tables
  WHERE schemaname = 'public')
select * from base
where relname like 'steps_2019_0%'
ORDER BY fragmentation DESC;
</pre>
( might not work with postgres 11 )
=== Global settings ===
 select * from pg_settings where category like 'Autovacuum';
=== per table settings ===
<pre>
select relname, reloptions, pg_namespace.nspname
from pg_class
join pg_namespace on pg_namespace.oid = pg_class.relnamespace
where pg_namespace.nspname = 'public';

select relname, substring(reloptions::varchar,0,70) as options, pg_namespace.nspname
from pg_class
join pg_namespace on pg_namespace.oid = pg_class.relnamespace
where pg_namespace.nspname = 'public'
ORDER BY reloptions;
</pre>
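Those reloptions can be set per table; a sketch with a hypothetical table name:
<pre>
ALTER TABLE big_table SET (
    autovacuum_vacuum_scale_factor = 0.02,
    autovacuum_vacuum_threshold    = 1000
);
-- revert to the global defaults
ALTER TABLE big_table RESET (autovacuum_vacuum_scale_factor, autovacuum_vacuum_threshold);
</pre>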
=== what vacuum is running now ? ===
postgres 9.6 and later ( wait_event replaced the old waiting column ):
<pre>
select
    datname,
    usename,
    pid,
    state,
    wait_event,
    current_timestamp - xact_start AS xact_runtime,
    query
from pg_stat_activity
where upper(query) LIKE '%VACUUM%' and
      pid != pg_backend_pid()
order by xact_start;
</pre>
postgres 9.5 and earlier ( still has the waiting column ):
<pre>
select
    datname,
    usename,
    pid,
    state,
    waiting,
    current_timestamp - xact_start AS xact_runtime,
    query
from pg_stat_activity
where upper(query) LIKE '%VACUUM%' and
      pid != pg_backend_pid()
order by xact_start;
</pre>
pg 11 ( pg_stat_progress_vacuum is available from 9.6 on ):
 select * from pg_stat_progress_vacuum;
=== current vacuum config ===
 SELECT relname, reltuples , reloptions FROM pg_class ORDER BY reloptions ASC ,relname;
doesn't work on pg 11
=== disk usage ===
<pre>
SELECT
    relname AS "table_name",
    pg_size_pretty(pg_table_size(C.oid)) AS "table_size"
FROM
    pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND nspname !~ '^pg_toast' AND relkind IN ('r')
ORDER BY pg_table_size(C.oid)
DESC LIMIT 20;
</pre>
=== stats ===
<pre>
SELECT
    relname,
    last_vacuum,
    last_autovacuum,
    vacuum_count,
    autovacuum_count
FROM
    pg_stat_user_tables
ORDER BY
    last_autovacuum ASC;
</pre>
<pre>
SELECT
    relname,
    last_vacuum,
    last_autovacuum,
    vacuum_count,
    autovacuum_count
FROM
    pg_stat_user_tables
ORDER BY
    last_autovacuum DESC;
</pre>
also see: https://www.datadoghq.com/blog/postgresql-vacuum-monitoring/

can also use "pg_stat_all_tables"

also see: https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW

possibly also from the log:
 grep -A 5 -i vacuum postgresql-201*.log | awk 'BEGIN{RS="--"}{gsub("\n"," ",$0); print $0 }'
=== transaction id wrap around ===
vacuum might happen because of this:

reference: https://www.cybertec-postgresql.com/en/autovacuum-wraparound-protection-in-postgresql/
<pre>
SELECT
    oid::regclass::text AS table,
    age(relfrozenxid) AS xid_age,
    mxid_age(relminmxid) AS mxid_age,
    least(
        (SELECT setting::int
         FROM pg_settings
         WHERE name = 'autovacuum_freeze_max_age') - age(relfrozenxid),
        (SELECT setting::int
         FROM pg_settings
         WHERE name = 'autovacuum_multixact_freeze_max_age') - mxid_age(relminmxid)
    ) AS tx_before_wraparound_vacuum,
    pg_size_pretty(pg_total_relation_size(oid)) AS size,
    pg_stat_get_last_autovacuum_time(oid) AS last_autovacuum
FROM pg_class
WHERE relfrozenxid != 0
AND oid > 16384
ORDER BY tx_before_wraparound_vacuum;
</pre>
=== Datadog Dashboard ===
This is a great dashboard:
* service latency ( the front-side view ): not really postgres, but the thing you don't want impacted by activity in the db
* Locks by Table
* Dead Rows by Table
* Locks by Mode
* Table Size by Table
* Live Rows
* IOPS
* CPU Utilization
== Replication ==
=== RDS replication ===
Permissions get replicated:
create a user on the master, wait a bit, and the reader will have those same permissions.
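To see how far behind a reader is, run this on the replica ( built-in function, no extension needed; the value grows while the master is idle, so treat spikes with care ):
 SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;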
== Performance ==
You should do this for your queries:
 EXPLAIN (ANALYZE, VERBOSE, BUFFERS, FORMAT JSON) SELECT * FROM fake_data LIMIT 10 OFFSET 200;
It can highlight issues:

wrong default_statistics_target ?

wrong work_mem ?
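A sketch of turning both knobs, session-level work_mem and a per-column statistics target ( table and column names are hypothetical ):
 SET work_mem = '256MB';                                          -- per sort/hash node, per session
 ALTER TABLE fake_data ALTER COLUMN some_col SET STATISTICS 1000; -- takes effect after the next ANALYZE
 ANALYZE fake_data;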
=== Join and sub queries cost the same now ===
<pre>
EXPLAIN (ANALYZE, BUFFERS FALSE, TIMING FALSE, SUMMARY FALSE)
SELECT titles_translations.*
FROM titles JOIN titles_translations USING(title_id)
WHERE titles.type = 'movie' AND titles.primary_name = 'How to Train Your Dragon';

EXPLAIN (ANALYZE, BUFFERS FALSE, TIMING FALSE, SUMMARY FALSE)
SELECT * FROM titles_translations
WHERE title_id IN( SELECT title_id FROM titles
                   WHERE type = 'movie' AND primary_name = 'How to Train Your Dragon' );

EXPLAIN (ANALYZE, BUFFERS FALSE, TIMING FALSE, SUMMARY FALSE)
SELECT * FROM titles_translations
WHERE EXISTS( SELECT * FROM titles
              WHERE title_id = titles_translations.title_id AND type = 'movie' AND primary_name = 'How to Train Your Dragon' );
</pre>
== Dead tuples report ==
 SELECT relname, n_dead_tup FROM pg_stat_user_tables order by n_dead_tup desc limit 30;
would there be a vacuum ?
<pre>
SELECT * FROM pg_stat_all_tables
ORDER BY n_dead_tup
    / (n_live_tup
       * current_setting('autovacuum_vacuum_scale_factor')::float8
       + current_setting('autovacuum_vacuum_threshold')::float8)
    DESC
LIMIT 10;
</pre>
== Lock report ==
tip: how to exclude your own query:
 WHERE a.pid != pg_backend_pid()
<pre>
SELECT a.datname,
       l.relation::regclass,
       l.mode,
       l.GRANTED,
       a.usename,
       substring(a.query,0,80) as query,
       a.query_start,
       age(now(), a.query_start) AS "age",
       a.pid
FROM pg_stat_activity a
JOIN pg_locks l ON l.pid = a.pid
WHERE a.pid != pg_backend_pid()
ORDER BY a.query_start;
</pre>
long running locking queries:
<pre>
select pid, state, usename, query, query_start, now() - query_start as elapse
from pg_stat_activity
where pid in (
  select pid
  from pg_locks l
  join pg_class t on l.relation = t.oid
  where t.relkind = 'r' AND
        database = 5302651  -- your database's oid: select oid from pg_database where datname = current_database();
)
ORDER BY elapse desc LIMIT 1;
</pre>
useful walk through of locking:

https://jaketrent.com/post/find-kill-locks-postgres
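pg_blocking_pids() ( 9.6+ ) gives the blockers directly; a quick sketch:
 SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, substring(query, 0, 80) AS query
 FROM pg_stat_activity
 WHERE cardinality(pg_blocking_pids(pid)) > 0;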
== Disk usage ==
By table:
<pre>
SELECT
    relname as "Table",
    pg_size_pretty(pg_total_relation_size(relid)) As "Size",
    pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as "External Size"
FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;
</pre>
https://wiki.postgresql.org/wiki/Disk_Usage
<pre>
SELECT *, pg_size_pretty(total_bytes) AS total
    , pg_size_pretty(index_bytes) AS INDEX
    , pg_size_pretty(toast_bytes) AS toast
    , pg_size_pretty(table_bytes) AS TABLE
  FROM (
  SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes FROM (
      SELECT c.oid,nspname AS table_schema, relname AS TABLE_NAME
              , c.reltuples AS row_estimate
              , pg_total_relation_size(c.oid) AS total_bytes
              , pg_indexes_size(c.oid) AS index_bytes
              , pg_total_relation_size(reltoastrelid) AS toast_bytes
          FROM pg_class c
          LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
          WHERE relkind = 'r'
  ) a
) a;
</pre>
<pre>
SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
    AND C.relkind <> 'i'
    AND nspname !~ '^pg_toast'
  ORDER BY pg_total_relation_size(C.oid) DESC
  LIMIT 5;
</pre>
== cache hit ratio summary ==
<pre>
SELECT
  sum(heap_blks_read) as heap_read,
  sum(heap_blks_hit)  as heap_hit,
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM pg_statio_user_tables;
</pre>
best to reset counters somewhere in there:
 pg_stat_reset()

== Enable Extensions ==
In AWS RDS Postgres check that the extension you want is already in the param group:
 Name: rds.extensions
 Values: ... big long string...
then check the system var:
 SHOW rds.extensions;
Then enable it:
 CREATE EXTENSION IF NOT EXISTS tablefunc;
Tablefunc gives you the awesome crosstabN() function.

References:
* https://www.postgresql.org/docs/9.1/sql-createextension.html
* https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html
* https://www.postgresql.org/docs/9.6/tablefunc.html#AEN190375 ( crosstabN() )

Pivot table in postgres? use crosstabN ...

== Random data ==
=== Generate ===
reference:

Random date:
<pre>
SELECT date_trunc('day', date '2014-01-10' + random() * interval '3 days' )::timestamp::date as day, seq, u.id AS user_id
FROM GENERATE_SERIES(1, 50) seq, users u;
</pre>
<pre>
SELECT TO_CHAR(day, 'YYYY-MM-DD'), random()
FROM generate_series
        ( '2017-02-01'::date
        , '2017-04-01'::date
        , '1 day'::interval) day
</pre>
 SELECT generate_series(1,10) AS id, md5(random()::text) AS descr;
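To keep the fake rows around, e.g. to feed the EXPLAIN example in the Performance section ( the table name fake_data is hypothetical ):
 CREATE TABLE fake_data AS
 SELECT generate_series(1, 100000) AS id, md5(random()::text) AS descr;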
=== Select ===
examine the shape of your ID:
<pre>
SELECT count(*) AS ct              -- optional
     , min(id)  AS min_id
     , max(id)  AS max_id
     , max(id) - min(id) AS id_span
FROM mytable ;
</pre>
let us suppose that your span is 5100000 and that your distribution of id is dense:
<pre>
WITH params AS (
   SELECT 1       AS min_id           -- minimum id <= current min id
        , 5100000 AS id_span          -- rounded up. (max_id - min_id + buffer)
    )
SELECT *
FROM  (
    SELECT p.min_id + trunc(random() * p.id_span)::integer AS id
    FROM   params p
          ,generate_series(1, 1100) g  -- 1000 + buffer
    GROUP  BY 1                        -- trim duplicates
    ) r
JOIN   big USING (id)
LIMIT 1000;
</pre>
== Also See ==
* [[Sql of note]]
* [https://www.citusdata.com/blog/2019/02/08/the-most-useful-postgres-extension-pg-stat-statements/ The most useful Postgres extension: pg_stat_statements]
* [https://gist.github.com/NikolayS/b6ec676ea63ab8d4db680a4c0d88e8bf Slowest query et al]
* https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT
* Why upgrade postgres: https://why-upgrade.depesz.com/show?from=12&to=12.1
* Graphical explain: explain.depesz.com
* https://www.enterprisedb.com/blog/8-cool-interesting-facts-things-postgresql-can-do

[[Category:DATA]]
[[Category:SQL]]