Setting up PostgreSQL streaming replication


Environment:

Primary: 192.168.1.11
Standby: 192.168.1.12

Software installation

Install the PostgreSQL software on both nodes first (the standby needs the binaries to run pg_basebackup; see the installation post below).

Create a replication user on the primary:

CREATE USER repuser
  REPLICATION
  LOGIN
  PASSWORD 'repuser';

 

Add the following line to pg_hba.conf:

host   replication     repuser          0.0.0.0/0                md5

 

Add the following to postgresql.conf:

hot_standby = on
max_wal_senders = 4
wal_level = hot_standby
archive_mode = on
#archive_command = '/bin/date'   # a no-op command effectively disables archiving (change requires restart)
archive_command = 'cp -i %p /u01/pg_archivelog/%f'
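When archiving, the server replaces %p with the path of the WAL segment to archive and %f with its file name alone. What the command above does can be sketched outside of PostgreSQL (temporary directories stand in for pg_xlog and /u01/pg_archivelog):

```shell
# Simulate one invocation of the archive_command above.
wal_dir=$(mktemp -d)       # stands in for $PGDATA/pg_xlog
archive_dir=$(mktemp -d)   # stands in for /u01/pg_archivelog

seg=000000010000000000000001   # a fake WAL segment name
touch "$wal_dir/$seg"

p="$wal_dir/$seg"   # what the server substitutes for %p
f="$seg"            # what the server substitutes for %f
cp -i "$p" "$archive_dir/$f"

ls "$archive_dir"   # the archived segment
```

In production, make sure /u01/pg_archivelog exists and is writable by the postgres user; if archive_command keeps failing, archiving stalls and pg_xlog keeps growing.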

 

Restart the primary:

pg_ctl stop
pg_ctl start

 

Take a base backup of the primary on the standby:

[postgres@oradbca ~]$ pg_basebackup -D /u01/pgdata -Fp -Xs -v -P -h 192.168.1.11 -p 5432 -U repuser
transaction log start point: 0/1A000028 on timeline 1
pg_basebackup: starting background WAL receiver
37929/37929 kB (100%), 2/2 tablespaces                                         
transaction log end point: 0/1A0000F0
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed

 

On the standby, edit postgresql.conf:

hot_standby = on

 

cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf

vi recovery.conf   # add the following four lines
standby_mode = 'on'
trigger_file = '/u01/pgdata/postgresql.trigger.file'
primary_conninfo = 'host=192.168.1.11 port=5432 user=repuser password=repuser  keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
recovery_target_timeline = 'latest'
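The trigger_file line is what makes manual failover possible later: creating that file on the standby makes it exit recovery and start accepting writes. A minimal sketch (a temporary directory stands in for /u01/pgdata so the commands run anywhere):

```shell
# Promote the standby by creating the trigger file named in recovery.conf.
pgdata=$(mktemp -d)   # in practice this is /u01/pgdata
touch "$pgdata/postgresql.trigger.file"
ls "$pgdata"          # the standby notices the file and promotes itself
```

Running pg_ctl promote on the standby achieves the same thing without a trigger file.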

 

Start the standby:

pg_ctl start

 

The standby's server log shows:

LOG:  entering standby mode
LOG:  redo starts at 0/1A000028
LOG:  consistent recovery state reached at 0/1B000000
LOG:  started streaming WAL from primary at 0/1B000000 on timeline 1

 

[postgres@oradbca ~]$ pg_controldata  |grep state
Database cluster state:               in archive recovery

The "in archive recovery" state confirms the standby is running and the setup succeeded.
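Replication can also be verified from SQL. A quick sketch: run the first query on the primary and the second on the standby:

```sql
-- On the primary: one row per connected standby
SELECT client_addr, state, sent_location, replay_location
  FROM pg_stat_replication;

-- On the standby: returns true (t) while in recovery
SELECT pg_is_in_recovery();
```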

Installing PostgreSQL


Install the required system packages:

yum install -y gcc gcc-c++ automake autoconf libtool make
yum install -y readline-devel
yum install -y zlib-devel

 

Create the postgres group and user:

groupadd postgres
useradd -g postgres -m postgres
passwd postgres

 

Update the postgres user's .bash_profile:

vi .bash_profile

export PGPORT=5432
export PGHOME=/u01/pgsql
export PGDATA=/u01/pgdata
export PATH=$PGHOME/bin:$PATH
export MANPATH=$PGHOME/share/man:$MANPATH
export LANG=en_US.utf8
export DATE=`date +"%Y-%m-%d %H:%M:%S"`
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH

 

Compile and install:

tar -zvxf postgresql-9.3.1.tar.gz
cd postgresql-9.3.1
./configure --prefix=/u01/pgsql
make && make install

 

Notes on the configure options:

--prefix=/u01/pgsql: install all files under /u01/pgsql (instead of the default /usr/local/pgsql).
--with-pgport=5432: set the default port number for server and clients (the default is 5432).
--with-perl: build the PL/Perl server-side language.
--with-python: build the PL/Python server-side language.
--with-tcl: build the PL/Tcl server-side language.
--with-openssl: build with support for SSL (encrypted) connections; this requires the OpenSSL package.
--with-pam: build with support for PAM (Pluggable Authentication Modules).
--with-ldap: build with LDAP support for authentication and connection-parameter lookup.
--with-libxml: build with libxml (enables SQL/XML support); requires libxml2 version 2.6.23 or later.
--with-libxslt: build the xml2 module using libxslt.
--enable-thread-safety: make the client libraries thread-safe.
--with-wal-segsize=16: set the WAL (Write-Ahead Log) segment size, in megabytes. This is the size of each individual file in the WAL log; adjusting it can be useful to control the granularity of WAL log shipping. The default is 16 MB, and the value must be a power of 2 between 1 and 64 (MB). Note that changing this requires an initdb.
--with-blocksize=16: set the block size, in kilobytes. This is the unit of table storage and I/O. The default of 8 KB is suitable for most situations, but other values may be useful in special cases. The value must be a power of 2 between 1 and 32 (KB). Note that changing this requires an initdb.
--enable-dtrace: build PostgreSQL with support for the dynamic tracing tool DTrace.
--enable-debug: compile all programs and libraries with debugging symbols.
--enable-nls[=LANGUAGES]: enable Native Language Support (NLS), that is, the ability to display program messages in a language other than English. LANGUAGES is a space-separated list of language codes for the languages you want supported, for example --enable-nls='de fr'. (The intersection between your list and the set of actually provided translations is computed automatically.) If you do not specify a list, all available translations are installed.
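For reference, a fuller invocation combining some of the options above might look like the following. This is only a sketch: enable just the features you need, and install the matching development packages (e.g. openssl-devel, libxml2-devel) beforehand.

```shell
./configure --prefix=/u01/pgsql \
            --with-pgport=5432 \
            --with-openssl \
            --with-libxml \
            --enable-thread-safety
```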

 

Install the init script so PostgreSQL starts at boot:

cp contrib/start-scripts/linux /etc/init.d/postgresql
chmod 755 /etc/init.d/postgresql 

chkconfig --add postgresql

Edit the configuration at the top of the script:
vi /etc/init.d/postgresql
prefix=/u01/pgsql

# Data directory
PGDATA="/u01/pgdata"

 

Initialize the database cluster:

initdb -D /u01/pgdata --locale=en_US.UTF8


Configure /u01/pgdata/pg_hba.conf, adding:
 

host    all             all             0.0.0.0/0                md5
host    replication     postgres        0.0.0.0/0                md5

 

Configure /u01/pgdata/postgresql.conf:

listen_addresses = '*'

 

Set a password for the postgres user:

[postgres@oradbca ~]$ psql
psql (9.3.4)
Type "help" for help.

postgres=# alter user postgres with password 'postgres';
ALTER ROLE
postgres=# 

 

 

PostgreSQL database file layout


 

All the data needed by a database cluster is stored within the cluster's data directory, commonly referred to as PGDATA (after the name of the environment variable that can be used to define it). A common location for PGDATA is /var/lib/pgsql/data.

The PGDATA directory contains several subdirectories and control files. Besides these required items, the cluster configuration files postgresql.conf, pg_hba.conf, and pg_ident.conf are traditionally stored in PGDATA as well.

 

 

Contents of PGDATA:

PG_VERSION        A file containing the major version number of PostgreSQL
base              Subdirectory containing per-database subdirectories
global            Subdirectory containing cluster-wide tables, such as pg_database
pg_clog           Subdirectory containing transaction commit status data
pg_multixact      Subdirectory containing multitransaction status data (used for shared row locks)
pg_subtrans       Subdirectory containing subtransaction status data
pg_tblspc         Subdirectory containing symbolic links to tablespaces
pg_twophase       Subdirectory containing state files for prepared transactions
pg_xlog           Subdirectory containing WAL (Write-Ahead Log) files
postmaster.opts   A file recording the command-line options the postmaster was last started with
postmaster.pid    A lock file recording the current postmaster PID and shared memory segment ID (absent after postmaster shutdown)

For each database in the cluster there is a subdirectory within PGDATA/base, named after the database's OID in pg_database. This subdirectory is the default location for the database's files; in particular, its system catalogs are stored there.

Each table and index is stored in a separate file, named after the table's or index's filenode number, which can be found in pg_class.relfilenode.

Note that while a table's filenode often matches its OID, this is not necessarily the case; some operations, such as TRUNCATE, REINDEX, CLUSTER, and some forms of ALTER TABLE, can change the filenode while preserving the OID. Do not assume that filenode and table OID are the same.

 

When a table or index exceeds 1 GB, it is divided into gigabyte-sized segments. The first segment's file name is the same as the filenode; subsequent segments are named filenode.1, filenode.2, and so on. This arrangement avoids problems on platforms that have file size limitations.

If any of a table's columns may hold fairly large values, the table has an associated TOAST table, which is used for out-of-line storage of field values too large to keep in the table rows proper. pg_class.reltoastrelid links from a table to its TOAST table, if one exists.
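The catalog columns mentioned above can be queried directly; a short sketch (the table name mytab is hypothetical):

```sql
-- Database OIDs: each matches a subdirectory under PGDATA/base
SELECT oid, datname FROM pg_database;

-- Filenode and TOAST table of a (hypothetical) table
SELECT relfilenode, reltoastrelid FROM pg_class WHERE relname = 'mytab';

-- On-disk path relative to PGDATA, e.g. base/16384/16385
SELECT pg_relation_filepath('mytab');
```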

PostgreSQL memory parameters


 
An overview of PostgreSQL's memory-related parameters:
 
 
shared_buffers (integer)

Sets the amount of memory the database server uses for shared memory buffers. The default is typically 128 megabytes (128MB), but might be less if your kernel settings will not support it (as determined during initdb). This setting must be at least 128 kilobytes. (Non-default values of BLCKSZ change the minimum.) However, settings significantly higher than the minimum are usually needed for good performance. This parameter can only be set at server start.

If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system. There are some workloads where even large settings for shared_buffers are effective, but because PostgreSQL also relies on the operating system cache, it is unlikely that an allocation of more than 40% of RAM to shared_buffers will work better than a smaller amount. Larger settings for shared_buffers usually require a corresponding increase in checkpoint_segments, in order to spread out the process of writing large quantities of new or changed data over a longer period of time.

On systems with less than 1GB of RAM, a smaller percentage of RAM is appropriate, so as to leave adequate space for the operating system. Also, on Windows, large values for shared_buffers aren't as effective. You may find better results keeping the setting relatively low and using the operating system cache more instead. The useful range for shared_buffers on Windows systems is generally from 64MB to 512MB.

temp_buffers (integer)

Sets the maximum number of temporary buffers used by each database session. These are session-local buffers used only for access to temporary tables. The default is eight megabytes (8MB). The setting can be changed within individual sessions, but only before the first use of temporary tables within the session; subsequent attempts to change the value will have no effect on that session.

A session will allocate temporary buffers as needed up to the limit given by temp_buffers. The cost of setting a large value in sessions that do not actually need many temporary buffers is only a buffer descriptor, or about 64 bytes, per increment in temp_buffers. However if a buffer is actually used an additional 8192 bytes will be consumed for it (or in general, BLCKSZ bytes).

max_prepared_transactions (integer)

Sets the maximum number of transactions that can be in the "prepared" state simultaneously (see PREPARE TRANSACTION). Setting this parameter to zero (which is the default) disables the prepared-transaction feature. This parameter can only be set at server start.

If you are not planning to use prepared transactions, this parameter should be set to zero to prevent accidental creation of prepared transactions. If you are using prepared transactions, you will probably want max_prepared_transactions to be at least as large as max_connections, so that every session can have a prepared transaction pending.

When running a standby server, you must set this parameter to the same or higher value than on the master server. Otherwise, queries will not be allowed in the standby server.

work_mem (integer)

Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. The value defaults to one megabyte (1MB). Note that for a complex query, several sort or hash operations might be running in parallel; each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.

maintenance_work_mem (integer)

Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 16 megabytes (16MB). Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.

Note that when autovacuum runs, up to autovacuum_max_workers times this memory may be allocated, so be careful not to set the default value too high.

max_stack_depth (integer)

Specifies the maximum safe depth of the server's execution stack. The ideal setting for this parameter is the actual stack size limit enforced by the kernel (as set by ulimit -s or local equivalent), less a safety margin of a megabyte or so. The safety margin is needed because the stack depth is not checked in every routine in the server, but only in key potentially-recursive routines such as expression evaluation. The default setting is two megabytes (2MB), which is conservatively small and unlikely to risk crashes. However, it might be too small to allow execution of complex functions. Only superusers can change this setting.

Setting max_stack_depth higher than the actual kernel limit will mean that a runaway recursive function can crash an individual backend process. On platforms where PostgreSQL can determine the kernel limit, the server will not allow this variable to be set to an unsafe value. However, not all platforms provide the information, so caution is recommended in selecting a value.
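Pulling the guidance above together, a memory section for a dedicated Linux server with 8 GB of RAM might look like the sketch below. The absolute numbers are illustrative assumptions, not values from the original text; tune them against your workload.

```
shared_buffers = 2GB            # ~25% of RAM on a dedicated server
temp_buffers = 8MB              # per session, only for temporary tables
work_mem = 16MB                 # per sort/hash operation; total use can be many times this
maintenance_work_mem = 256MB    # VACUUM, CREATE INDEX; allocated per autovacuum worker too
max_prepared_transactions = 0   # leave at 0 unless you use PREPARE TRANSACTION
max_stack_depth = 2MB           # keep below 'ulimit -s' minus a safety margin
```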

 

LTFS on tape


A few days ago we bought a tape drive to hold database backups. The raw tape format was hard to manage, so we formatted the tape as a file system.

LTFS (Linear Tape File System) is a file system specification that adds an index to LTO (Linear Tape-Open) storage technology.

LTFS divides an LTO-5 tape into two partitions: partition 0 stores the directory structure and pointers, while partition 1 stores the data itself, letting the drive locate the required data quickly. Putting a file system on the tape allows users to organize and browse tape contents the way they would a hard disk, improving data access times; with LTFS, dragging and dropping files onto tape is as simple as doing so on disk.

IBM introduced the LTFS format in 2010 to support tape archiving needs. The LTO consortium members HP, IBM, and Seagate subsequently adopted the LTFS format specification, which defines how data and metadata on tape are stored in a hierarchical directory structure. Before downloading the software, make sure your operating system is supported by LTFS.

Installing LTFS

1. Install the RPM package

[root@db-backup ~]# yum install rhel6-qtmltfs-2.1.1-1.x86_64.rpm

 

2. Create the LTFS file system

[root@db-backup ~]# mkltfs -d /dev/st0

LTFS15000I Starting mkltfs, QUANTUMLTFS Standalone version 2.1.1, log level 2

LTFS15041I Launched by "mkltfs -d /dev/st0"

LTFS15042I This binary is built for Linux (x86_64)

LTFS15043I GCC version is 4.4.7 20120313 (Red Hat 4.4.7-4)

LTFS17087I Kernel version: Linux version 2.6.32-279.el6.x86_64 (mockbuild@c6b9.bsys.dev.centos.org) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS15003I Formatting device '/dev/st0'

LTFS15004I LTFS volume blocksize: 524288

LTFS15005I Index partition placement policy: None



LTFS17085I Plugin: Loading "ltotape" driver

LTFS20013I Drive type is ULTRIUM-HH6, serial number is 1068024823, firmware revision is D8E5

LTFS17160I Maximum device block size is 1048576

LTFS15049I Checking the medium

LTFS15010I Creating data partition b on SCSI partition 1

LTFS15011I Creating index partition a on SCSI partition 0

LTFS17165I Resetting the medium's capacity proportion

LTFS11097I Partitioning the medium

LTFS11100I Writing label to partition b

LTFS11278I Writing index to partition b

LTFS11100I Writing label to partition a

LTFS11278I Writing index to partition a

LTFS15013I Volume UUID is: 5d12d3c7-6667-4fd3-aea0-c6312f1f0cef



LTFS15019I Volume capacity is 2408 GB

LTFS15024I Medium formatted successfully

3. Check the LTFS file system

[root@db-backup ~]# ltfsck  /dev/nst0

LTFS16000I Starting ltfsck, QUANTUMLTFS Standalone version 2.1.1, log level 2

LTFS16088I Launched by "ltfsck /dev/nst0"

LTFS16089I This binary is built for Linux (x86_64)

LTFS16090I GCC version is 4.4.7 20120313 (Red Hat 4.4.7-4)

LTFS17087I Kernel version: Linux version 2.6.32-279.el6.x86_64 (mockbuild@c6b9.bsys.dev.centos.org) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS17085I Plugin: Loading "ltotape" driver

LTFS20013I Drive type is ULTRIUM-HH6, serial number is 1068024823, firmware revision is D8E5

LTFS17160I Maximum device block size is 1048576

LTFS16014I Checking LTFS file system on '/dev/nst0'

LTFS16023I LTFS volume information:

LTFS16024I Volser (bar code) :      

LTFS16025I Volume UUID     : cd2aea33-abd8-475a-9500-62164c194dc3

LTFS16026I Format time     : 2014-05-16 15:05:49.637843234 CST

LTFS16027I Block size      : 524288

LTFS16028I Compression     : Enabled

LTFS16029I Index partition : ID = a, SCSI Partition = 0

LTFS16030I Data partition  : ID = b, SCSI Partition = 1



LTFS11005I Mounting the volume

LTFS11026I Performing a full medium consistency check

LTFS11233I Updating MAM coherency data

LTFS10023I LTFS volume information:

LTFS10031I Volume Name     : LTFS VOLUME

LTFS10024I Volser(Barcode) :      

LTFS10025I Volume UUID     : cd2aea33-abd8-475a-9500-62164c194dc3

LTFS10026I Format Time     : 2014-05-16 15:05:49.637843234 CST

LTFS10027I Block Size      : 524288

LTFS10028I Compression     : Enabled

LTFS10029I Index Partition : ID = a, SCSI Partition = 0, Total Capacity = 35060 MiB, Available Space = 35053 MiB

LTFS10030I Data Partition  : ID = b, SCSI Partition = 1, Total Capacity = 2296532 MiB, Available Space = 2296532 MiB



LTFS11034I Volume unmounted successfully

LTFS16022I Volume is consistent





4. Mount the LTFS file system

[root@db-backup ~]# mkdir /tape_filesystem/

[root@db-backup ~]# ltfs /tape_filesystem/

LTFS14000I Starting ltfs, QUANTUMLTFS Standalone version 2.1.1, log level 2

LTFS14058I LTFS Format Specification version 2.1.0

LTFS14104I Launched by "ltfs /tape_filesystem/"

LTFS14105I This binary is built for Linux (x86_64)

LTFS14106I GCC version is 4.4.7 20120313 (Red Hat 4.4.7-4)

LTFS17087I Kernel version: Linux version 2.6.32-279.el6.x86_64 (mockbuild@c6b9.bsys.dev.centos.org) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS17089I Distribution: CentOS release 6.3 (Final)

LTFS14063I Sync type is "time", Sync time is 300 sec

LTFS17085I Plugin: Loading "ltotape" driver

LTFS17085I Plugin: Loading "unified" iosched

LTFS20013I Drive type is ULTRIUM-HH6, serial number is 1068024823, firmware revision is D8E5

LTFS17160I Maximum device block size is 1048576

LTFS11005I Mounting the volume

LTFS10023I LTFS volume information:

LTFS10031I Volume Name     : LTFS VOLUME

LTFS10024I Volser(Barcode) :      

LTFS10025I Volume UUID     : cd2aea33-abd8-475a-9500-62164c194dc3

LTFS10026I Format Time     : 2014-05-16 15:05:49.637843234 CST

LTFS10027I Block Size      : 524288

LTFS10028I Compression     : Enabled

LTFS10029I Index Partition : ID = a, SCSI Partition = 0, Total Capacity = 35060 MiB, Available Space = 35053 MiB

LTFS10030I Data Partition  : ID = b, SCSI Partition = 1, Total Capacity = 2296532 MiB, Available Space = 2296532 MiB



LTFS14111I Initial setup completed successfully

LTFS14112I Invoke 'mount' command to check the result of final setup

LTFS14113I Specified mount point is listed if succeeded

[root@db-backup ~]# df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/mapper/vg_root-LogVol00

                      909G  647G  216G  75% /

tmpfs                  16G     0   16G   0% /dev/shm

/dev/sda1             194M   27M  158M  15% /boot

ltfs:/dev/nst0        2.2T     0  2.2T   0% /tape_filesystem

The mount succeeded; the tape can now be managed like an ordinary file system.