plproxy的调用流程

plproxy能够在Postgresql上运行的一种过程语言，能够完成对远程数据库的调用，并能够完成数据切片的功能。数据流处理过程如下图所示：

首先需要明确的是plproxy只能对用户自定的方法才有小，如果想达到对sql语句的无条件转发，plproxy是做不到的。比如希望所有的select * from tablename 都转发到cluster1上，所有的update语句都转发到cluster2上，是不能通过plproxy做到的。
plproxy能做到的是：在plproxy cluster上定义foo()函数，在cluster1和cluster2上也定义foo()函数，cluster1和cluster2上的foo()函数的定义是完全相同的，包括函数参数都是相同的；二者和plproxy cluster上的foo()定义均不同。

1. plproxy的关键函数

a. plproxy.get_cluster_version(cluster_name) 举例如下：

@H_403_15@

CREATEORREPLACEFUNCTIONplproxy.get_cluster_version(cluster_nametext)

RETURNSint4AS$$

BEGIN

IFcluster_name='a_cluster'THEN

RETURN1;

ENDIF;

RAISEEXCEPTION'Unknowncluster';

END;

$$LANGUAGEplpgsql;

该函数返回的是当前plproxy的配置版本号，每当有请求时此函数都会调用。如果该函数的返回值和plproxy的缓存的配置版本号不同，则plproxy会认为配置已经更新，就会调用下面两个函数重新读取配置。

b. plproxy.get_cluster_partitions(cluster_name text) 举例如下：

@H_403_15@

CREATEORREPLACEFUNCTIONplproxy.get_cluster_partitions(cluster_nametext)

RETURNSSETOFtextAS$$

BEGIN

IFcluster_name='a_cluster'THEN

RETURNNEXT'dbname=part00host=127.0.0.1';

RETURNNEXT'dbname=part01host=127.0.0.1';

RETURN;

ENDIF;

RAISEEXCEPTION'Unknowncluster';

END;

$$LANGUAGEplpgsql;

该函数返回plproxy所设置的内容，当所访问的cluster名称为a_cluster时，返回RETURN NEXT所对应的远程主机配置。

c. plproxy.get_cluster_config(cluster) 举例如下：

@H_403_15@

CREATEORREPLACEFUNCTIONplproxy.get_cluster_config(

incluster_nametext,

outkeytext,

outvaltext)

RETURNSSETOFrecordAS$$

BEGIN

--letsusesameconfigforallclusters

key:='connection_lifetime';

val:=30*60;--30m

RETURNNEXT;

RETURN;

END;

$$LANGUAGEplpgsql;

该函数返回plproxy与cluser1和cluster2链接时的一些参数，如生命周期，超时时间，是否使用binary IO等。

2. foo()函数的定义：

在plproxy cluster上定义foo的内容如下：

@H_403_15@

CREATEORREPLACEFUNCTIONfoo(i_usernametext)

RETURNStextAS

$BODY$

CLUSTER'a_cluster';RUNONhashtext(i_username) & 1;

$BODY$

LANGUAGE'plproxy'VOLATILE

COST 100;

在cluster1和cluster2上定义foo的内容如下：

@H_403_15@

CREATEORREPLACEFUNCTIONfoo(i_usernametext)

RETURNStextAS

$BODY$

BEGIN

RETURN'useralreadyexists';

END;

$BODY$

LANGUAGE'plpgsql'VOLATILESECURITYDEFINER

COST100;

可以看到，在plproxy上定义的foo()函数的language是“plproxy”，在cluster1和2上的foo()函数定义的language是“plpgsql”，这是最主要的区别。

3. 调用过程

a. 客户端发送select foo()的请求到plproxy cluster;
b. plproxy cluster发现foo()是用户自定义的函数，查找到该函数定义内容发现该函数是通过plproxy language定义的
c. plproxy cluster把控制权转交到plproxy language的handler，
d. 该handler执行foo()的内容，发现该函数需要使用名为"a_cluster"的配置
e. handler调用 plproxy.get_cluster_version("a_cluster"),发现配置版本号相符；
则继续调用 plproxy.get_cluster_partitions("a_cluster")获得四项远程cluster的链接配置；
然后继续调用 plproxy.get_cluster_config( in cluster_nametext, out keytext, out valtext)获得链接时的配
置信息
f. 最后根据 RUNONhashtext(i_username) & 1 把"select foo()"请求转发到cluster1或者2上。
hashtext(i_username) & 2 会返回一个0-1之间的值，根据这个值确定应该转发请求到哪个具体的cluster上.
g. 在cluster上执行"select foo()",把结果" useralreadyexists"返回给plproxy cluster
h. plproxy cluster再把收到的结果返回给client.

4. plproxy的局限
plproxy并不能完成无条件的转发，只能在自定义的函数上实现此功能。这就要求在实际的应用中，需要把大量的业务逻辑放到Postgresql服务器端来完成，降低了灵活度。

一点体会：使用Postgresql，需要改变思路，需要适应把业务逻辑通过服务器端编程实现的过程。这是与MysqL的很大不同。

plproxy官方链接

例子： PostgreSQL cluster: partitioning with plproxy (part I)
PostgreSQL cluster: partitioning with plproxy (part II)

plproxy的调用流程

猜你在找的Postgre SQL相关文章