How to set the value of a composite variable field using dynamic SQL

Given this type:

    -- Just for testing purposes:
    CREATE TYPE testType as (name text);

I can get the value of a field dynamically with this function:

    CREATE OR REPLACE FUNCTION get_field(object anyelement, field text)
      RETURNS text as
    $BODY$
    DECLARE
        value text;
    BEGIN
        EXECUTE 'SELECT $1."' || field || '"'
          USING object
           INTO value;

        return value;
    END;
    $BODY$
    LANGUAGE plpgsql

Calling get_field('(david)'::testType, 'name') works as expected and returns "david".

But how can I set the value of a field in a composite type? I have tried these functions:

    CREATE OR REPLACE FUNCTION set_field_try1(object anyelement, field text, value text)
      RETURNS anyelement as
    $BODY$
    DECLARE
        value text;
    BEGIN
        EXECUTE '$1."' || field || '" := $2'
          USING object, value;

        return object;
    END;
    $BODY$
    LANGUAGE plpgsql

    CREATE OR REPLACE FUNCTION set_field_try2(object anyelement, field text, value text)
      RETURNS anyelement as
    $BODY$
    DECLARE
        value text;
    BEGIN
        EXECUTE 'SELECT $1 INTO $2."' || field || '"'
          USING value, object;

        return object;
    END;
    $BODY$
    LANGUAGE plpgsql

    CREATE OR REPLACE FUNCTION set_field_try3(object anyelement, field text, value text)
      RETURNS anyelement as
    $BODY$
    DECLARE
        value text;
    BEGIN
        EXECUTE 'BEGIN $1."' || field || '" := $2; SELECT $1; END;'
           INTO object
          USING value, object;

        return object;
    END;
    $BODY$
    LANGUAGE plpgsql

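and some variations. Calling set_field_tryX does not work. I always get "ERROR: syntax error at or near ...". How can I accomplish this?

Notes:

  • The parameter is anyelement, and the field can be any field in the composite type. I can't just use object.name.
  • I'm concerned about SQL injection. Any advice on this would be appreciated, but it is not my question.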
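On the SQL injection note, one common approach is to splice the field name in as a quoted identifier instead of concatenating it raw. The following is only a minimal sketch, assuming PostgreSQL 9.1 or later (where format() is available). The names get_field_safe and demo_person are made up for illustration and are not part of the original question.

```sql
-- Hypothetical demo type, used only for this sketch.
CREATE TYPE demo_person AS (name text, city text);

-- Hypothetical variant of get_field: the field name is inserted with
-- format()'s %I specifier, which quotes it as an identifier.
CREATE OR REPLACE FUNCTION get_field_safe(object anyelement, field text)
  RETURNS text
  LANGUAGE plpgsql AS
$func$
DECLARE
   result text;
BEGIN
   -- The record itself is passed as a parameter ($1), never concatenated.
   EXECUTE format('SELECT ($1).%I', field)
   INTO   result
   USING  object;

   RETURN result;
END
$func$;

SELECT get_field_safe('(david,Vienna)'::demo_person, 'name');  -- david
```

Because the field name is quoted as a single identifier, a malicious string ends up as a (non-existent) column name and raises an error instead of being executed as SQL.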

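Faster with hstore

Since Postgres 9.0, with the additional module hstore installed in your database, there is a very simple and fast solution with the #= operator, which ...

replace[s] fields in record with matching values from hstore.

To install the module (CREATE EXTENSION requires Postgres 9.1 or later):

    CREATE EXTENSION hstore;

Examples: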

    SELECT my_record #= '"field"=>"value"'::hstore;  -- with string literal
    SELECT my_record #= hstore(field, value);        -- with values

Values have to be cast to text and back, obviously.
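To use the operator with a field name passed in as text, it can be wrapped in a small function. This is only a sketch: set_field_hstore and the demo type demo_rec are hypothetical names, and it assumes the hstore extension can be installed and is visible on the search_path.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- Hypothetical demo type, used only for this sketch.
CREATE TYPE demo_rec AS (id int, name text);

-- Hypothetical wrapper: set one field by name with the hstore #= operator.
-- $1 = record, $2 = field name, $3 = new value (as text).
CREATE OR REPLACE FUNCTION set_field_hstore(_rec anyelement, _field text, _val text)
  RETURNS anyelement
  LANGUAGE sql AS
$func$
SELECT $1 #= hstore($2, $3);
$func$;

SELECT set_field_hstore('(1,david)'::demo_rec, 'name', 'erwin');  -- (1,erwin)
SELECT set_field_hstore('(1,david)'::demo_rec, 'id', '42');       -- (42,david)
```

The new value travels as text and is converted to the field's type when the record is rebuilt, which is the cast "to text and back" mentioned above.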

Example plpgsql functions with more details:

  • Endless loop in trigger function
  • Assign to NEW by key in a Postgres trigger

Almost as fast with json

Postgres has similar solutions built in with json (pg 9.3+) or jsonb (pg 9.4+), so you don't need an additional module. Note that this use is not documented at present.

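See @Geir's added answer for details.

Without hstore and json

If you are on an older version, or cannot install the additional module hstore, or cannot assume it is installed, here is an improved version of what I posted previously. It is still slower than the hstore operator, though: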

    CREATE OR REPLACE FUNCTION f_setfield(INOUT _comp_val anyelement
                                        , _field text, _val text)
      RETURNS anyelement AS
    $func$
    BEGIN

    EXECUTE 'SELECT ' || array_to_string(ARRAY(
          SELECT CASE WHEN attname = _field
                    THEN '$2'
                    ELSE '($1).' || quote_ident(attname)
                 END AS fld
          FROM   pg_catalog.pg_attribute
          WHERE  attrelid = pg_typeof(_comp_val)::text::regclass
          AND    attnum > 0
          AND    attisdropped = FALSE
          ORDER  BY attnum
          ), ',')
    USING  _comp_val, _val
    INTO   _comp_val;

    END
    $func$ LANGUAGE plpgsql STABLE;

Call:

    CREATE TEMP TABLE t( a int, b text);  -- Composite type for testing
    SELECT f_setfield(NULL::t, 'a', '1');

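Notes

  • An explicit cast of the value _val to the target data type is not necessary: a string literal in the dynamic query would be coerced automatically, which obviates the subquery on pg_type. But I took it one step further:

  • Replace quote_literal(_val) with direct value insertion via the USING clause. This saves one function call and two casts, and is safer anyway. text is coerced to the target type automatically in modern PostgreSQL. (Not tested with versions before 9.1.)

  • array_to_string(ARRAY()) is faster than string_agg().

  • No variables needed, no DECLARE. Fewer assignments.

  • No subquery in the dynamic SQL. ($1).field is faster.

  • pg_typeof(_comp_val)::text::regclass
    does the same as
    (SELECT typrelid FROM pg_catalog.pg_type WHERE oid = pg_typeof($1)::oid)
    for valid composite types, only faster.
    This last modification is built on the assumption that pg_type.typname is always identical to the associated pg_class.relname for registered composite types, so that the double cast can replace the subquery. I ran this test in a large database to verify, and it came up empty as expected: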

        SELECT *
        FROM   pg_catalog.pg_type t
        JOIN   pg_namespace n ON n.oid = t.typnamespace
        WHERE  t.typrelid > 0  -- exclude non-composite types
        AND    t.typrelid IS DISTINCT FROM
               (quote_ident(n.nspname) || '.' || quote_ident(typname))::regclass
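  • The use of an INOUT parameter obviates the need for an explicit RETURN. This is just a notational shortcut. Pavel doesn't like it; he prefers an explicit RETURN statement ...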
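The equivalence claimed for the double cast can be checked for a single type with a query like the one below. This is just an illustrative sketch. The temp table demo_tbl is a hypothetical name, and the comparison only shows that both expressions resolve to the same pg_class entry for this one type.

```sql
-- Hypothetical temp table; its row type serves as the composite type.
CREATE TEMP TABLE demo_tbl (a int, b text);

-- Both columns should show the same OID (the pg_class entry of demo_tbl).
SELECT pg_typeof(NULL::demo_tbl)::text::regclass::oid   AS via_double_cast
     , (SELECT typrelid
        FROM   pg_catalog.pg_type
        WHERE  oid = pg_typeof(NULL::demo_tbl)::oid)    AS via_subquery;
```

The double cast goes through the type's text name, so it relies on that name resolving to the right relation under the current search_path.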

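Everything put together, this is almost twice as fast as the previous version.


Original (outdated) answer:

The result is a version that is ~2.25 times faster. But I probably couldn't have done it without building on Pavel's second version.

In addition, this version avoids most of the casting to text and back by doing everything in a single query, so it should be much less error-prone.
Tested with PostgreSQL 9.0 and 9.1.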

    CREATE FUNCTION f_setfield(_comp_val anyelement, _field text, _val text)
      RETURNS anyelement AS
    $func$
    DECLARE
       _list text;
    BEGIN
       _list := (
          SELECT string_agg(x.fld, ',')
          FROM  (
             SELECT CASE WHEN a.attname = $2
                       THEN quote_literal($3) || '::'|| (SELECT quote_ident(typname)
                                                        FROM   pg_catalog.pg_type
                                                        WHERE  oid = a.atttypid)
                       ELSE quote_ident(a.attname)
                    END AS fld
             FROM   pg_catalog.pg_attribute a
             WHERE  a.attrelid = (SELECT typrelid
                                  FROM   pg_catalog.pg_type
                                  WHERE  oid = pg_typeof($1)::oid)
             AND    a.attnum > 0
             AND    a.attisdropped = false
             ORDER  BY a.attnum
             ) x
          );

       EXECUTE 'SELECT ' || _list || ' FROM (SELECT $1.*) x'
       USING  $1
       INTO   $1;

       RETURN $1;
    END
    $func$ LANGUAGE plpgsql STABLE;

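I wrote a second version of the setfield function. It works on Postgres 9.1; I have not tested it on older versions. It is no miracle (in terms of performance), but it is more robust and about 8 times faster than the previous one.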

    CREATE OR REPLACE FUNCTION public.setfield2(anyelement, text, text)
      RETURNS anyelement
      LANGUAGE plpgsql
    AS $function$
    DECLARE
       _name   text;
       _values text[];
       _value  text;
       _attnum int;
    BEGIN
       FOR _name, _attnum IN
          SELECT a.attname, a.attnum
          FROM   pg_catalog.pg_attribute a
          WHERE  a.attrelid = (SELECT typrelid FROM pg_type WHERE oid = pg_typeof($1)::oid)
          AND    a.attnum > 0
       LOOP
          IF _name = $2 THEN
             _value := $3;
          ELSE
             EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
          END IF;
          _values[_attnum] := COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
       END LOOP;

       EXECUTE 'SELECT (' || quote_ident(pg_typeof($1)::text) || ' ''(' || array_to_string(_values,',') || ')'').*'
       INTO $1;

       RETURN $1;
    END;
    $function$;

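Update/caution: Erwin points out that this is currently undocumented, and the manual indicates that it should not be possible to alter records this way.

Use hstore or Pavel's solution instead.

This simple json-based solution is almost as fast as hstore and only requires Postgres 9.3 or newer. It should be a good option if you cannot use the hstore extension, and the performance difference should be negligible. Benchmarks: https://stackoverflow.com/a/28673542/1914376

a) We can do it inline with cast / concat. The json function requires Postgres 9.3:

    SELECT json_populate_record(
       record
     , ('{"'||'key'||'":"'||'new-value'||'"}')::json
    );

b) Or inline, using functions from Postgres 9.4:

    SELECT json_populate_record(
       record
     , json_object(ARRAY['key', 'new-value'])
    );

Note: I chose json_object(ARRAY[key,value]) because it was a bit faster than json_build_object(key,value).

To hide the casting details, you can wrap a) in a function, with almost no overhead.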

    CREATE FUNCTION x.setfield_json(in_element anyelement, key text, value text)
      RETURNS anyelement AS
    $BODY$
       SELECT json_populate_record( in_element, ('{"'||key||'":"'||value||'"}')::json);
    $BODY$ LANGUAGE sql;
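One caveat with building the json text by concatenation: a key or value containing a double quote or a backslash produces invalid json, so the cast fails. A possible variant, sketched here under the assumption of Postgres 9.4 or later, lets json_object() do the escaping, as in b). The names setfield_json_obj and demo_item are hypothetical and not part of the benchmark.

```sql
-- Hypothetical demo type, used only for this sketch.
CREATE TYPE demo_item AS (id int, label text);

-- Hypothetical wrapper around variant b): json_object() builds the json
-- object from a key/value array and escapes special characters itself.
-- $1 = record, $2 = field name, $3 = new value (as text).
CREATE OR REPLACE FUNCTION setfield_json_obj(in_element anyelement, _key text, _val text)
  RETURNS anyelement
  LANGUAGE sql AS
$func$
SELECT json_populate_record($1, json_object(ARRAY[$2, $3]));
$func$;

-- The value contains double quotes, which the concatenation variant
-- could not handle.
SELECT (setfield_json_obj('(1,old)'::demo_item, 'label', 'say "hi"')).label;
```

The same caution applies as for the other json variants: this relies on json_populate_record() keeping the values of the base record for fields not present in the json object.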

Outside of plpgsql (in a dynamic SQL context), "SELECT INTO" has a different meaning: it stores the query result in a table.

Modifying any field is possible, but not simple:

    CREATE OR REPLACE FUNCTION public.setfield(a anyelement, text, text)
      RETURNS anyelement
      LANGUAGE plpgsql
    AS $function$
    begin
       create temp table aux as select $1.*;
       execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
       select into $1 * from aux;
       drop table aux;
       return $1;
    end;
    $function$

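But this code is not very efficient; it is not possible to write this well in plpgsql. You can find some C libraries that should do this.

Test setup and benchmarks v2

Erwin encouraged readers to reproduce his benchmark in this thread (https://stackoverflow.com/a/7782839/1914376), so I modified his code to use synthetic test data and added both the hstore solution and the json solution from my answer (plus a json solution by Pavel, found in another thread). The benchmark now runs as a single query, which makes it easier to capture the results.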

    DROP SCHEMA IF EXISTS x CASCADE;
    CREATE SCHEMA x;

    -- Pavel 1:
    --------------------------------------------------------------------------------
    CREATE OR REPLACE FUNCTION x.setfield(anyelement, text, text)
      RETURNS anyelement
      LANGUAGE plpgsql
    AS $function$
    begin
       create temp table aux as select $1.*;
       execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
       select into $1 * from aux;
       drop table aux;
       return $1;
    end;
    $function$;

    -- Pavel 2 (with patches)
    --------------------------------------------------------------------------------
    CREATE OR REPLACE FUNCTION x.setfield2(anyelement, text, text)
      RETURNS anyelement
      LANGUAGE plpgsql
    AS $function$
    DECLARE
       _name   text;
       _values text[];
       _value  text;
       _attnum int;
    BEGIN
       FOR _name, _attnum IN
          SELECT a.attname, a.attnum
          FROM   pg_catalog.pg_attribute a
          WHERE  a.attrelid = (SELECT typrelid FROM pg_type WHERE oid = pg_typeof($1)::oid)
       LOOP
          IF _name = $2 THEN
             _value := $3;
          ELSE
             EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
          END IF;
          _values[_attnum] := COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
       END LOOP;

       EXECUTE 'SELECT (' || pg_typeof($1)::text || '''(' || array_to_string(_values,',') || ')'').*'
       INTO $1;

       RETURN $1;
    END;
    $function$;

    -- Erwin 1
    --------------------------------------------------------------------------------
    CREATE OR REPLACE FUNCTION x.setfield3(anyelement, text, text)
      RETURNS anyelement AS
    $body$
    DECLARE
       _list text;
    BEGIN
       _list := (
          SELECT string_agg(x.fld, ',')
          FROM  (
             SELECT CASE WHEN a.attname = $2
                       THEN quote_literal($3)
                       ELSE quote_ident(a.attname)
                    END AS fld
             FROM   pg_catalog.pg_attribute a
             WHERE  a.attrelid = (SELECT typrelid FROM pg_type WHERE oid = pg_typeof($1)::oid)
             ORDER  BY a.attnum
             ) x
          );

       EXECUTE ' SELECT ' || _list || ' FROM (SELECT $1.*) x'
       USING  $1
       INTO   $1;

       RETURN $1;
    END;
    $body$ LANGUAGE plpgsql;

    -- Erwin 2
    --------------------------------------------------------------------------------
    CREATE OR REPLACE FUNCTION x.setfield4(INOUT _comp_val anyelement
                                         , _field text, _val text)
      RETURNS anyelement AS
    $func$
    BEGIN

    EXECUTE 'SELECT ' || array_to_string(ARRAY(
          SELECT CASE WHEN attname = _field
                    THEN '$2'
                    ELSE '($1).' || quote_ident(attname)
                 END AS fld
          FROM   pg_catalog.pg_attribute
          WHERE  attrelid = pg_typeof(_comp_val)::text::regclass
          AND    attnum > 0
          AND    attisdropped = FALSE
          ORDER  BY attnum
          ), ',')
    USING  _comp_val, _val
    INTO   _comp_val;

    END
    $func$ LANGUAGE plpgsql;

    -- Pavel 3: json. (Postgres 9.4)
    -- Found here: https://stackoverflow.com/a/28284491/1914376
    --------------------------------------------------------------------------------
    CREATE OR REPLACE FUNCTION x.setfield5(r anyelement, fn text, val text, OUT result anyelement)
      RETURNS anyelement
      LANGUAGE plpgsql
    AS $function$
    declare
       jo json;
    begin
       jo := (select json_object(array_agg(key),
                                 array_agg(case key when fn then val else value end))
              from json_each_text(row_to_json(r)));
       result := json_populate_record(r, jo);
    end;
    $function$;

    -- Json. Use built-in json functions (Postgres 9.3)
    -- This is available from 9.3 since we create json by casting
    -- instead of using json_object/json_build_object only available from 9.4
    --------------------------------------------------------------------------------
    CREATE FUNCTION x.setfield_json(in_element anyelement, key text, value text)
      RETURNS anyelement AS
    $BODY$
       SELECT json_populate_record( in_element, ('{"'||key||'":"'||value||'"}')::json);
    $BODY$ LANGUAGE sql;

    --------------------------------------------------------------------------------
    -- Test setup
    --------------------------------------------------------------------------------

    -- composite type for tests.
    CREATE TYPE x.t_f as (
      id       int
     ,company  text
     ,sort     text
     ,log_up   timestamp
     ,log_upby smallint
    );

    -- Create temp table with synthetic test data
    DROP   TABLE IF EXISTS tmp_f;
    CREATE TEMP table tmp_f AS
       SELECT ROW(i, 'company'||i, NULL, NULL, NULL)::x.t_f AS f
       FROM   generate_series(1, 5000) S(i);

    -- Run the benchmark
    DO $$
    DECLARE
       start_time       timestamptz;
       test_count       integer;
       test_description TEXT;
    BEGIN
       test_count := 200;
       test_description := 'setfield, Pavel 1: temptable';
       start_time := clock_timestamp();
       PERFORM x.setfield (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
       RAISE NOTICE 'Test took: % ms (for % rows) Name: %',
          extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

       test_count := 5000;
       test_description := 'setfield2, Pavel 2: reflection';
       start_time := clock_timestamp();
       PERFORM x.setfield2 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
       RAISE NOTICE 'Test took: % ms (for % rows) Name: %',
          extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

       test_count := 5000;
       test_description := 'setfield3, Erwin 1: reflection';
       start_time := clock_timestamp();
       PERFORM x.setfield3 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
       RAISE NOTICE 'Test took: % ms (for % rows) Name: %',
          extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

       test_count := 5000;
       test_description := 'setfield4, Erwin 2: reflection';
       start_time := clock_timestamp();
       PERFORM x.setfield4 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
       RAISE NOTICE 'Test took: % ms (for % rows) Name: %',
          extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

       test_count := 5000;
       test_description := 'setfield5, Pavel 3: json (PG 9.4)';
       start_time := clock_timestamp();
       PERFORM x.setfield5 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
       RAISE NOTICE 'Test took: % ms (for % rows) Name: %',
          extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

       test_count := 5000;
       test_description := 'setfield_json, Geir 1: casting (PG 9.3)';
       start_time := clock_timestamp();
       PERFORM x.setfield_json (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
       RAISE NOTICE 'Test took: % ms (for % rows) Name: %',
          extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

       -- json_object(ARRAY[key,value]) is actually faster than json_build_object(key, value)
       test_count := 5000;
       test_description := 'no function/inlined: json_object (PG 9.4)';
       start_time := clock_timestamp();
       PERFORM json_populate_record( f, json_object(ARRAY['company', 'new-value'||md5(random()::text)] )) FROM tmp_f LIMIT test_count;
       RAISE NOTICE 'Test took: % ms (for % rows) Name: %',
          extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

       test_count := 5000;
       test_description := 'no function/inlined: hstore (PG 9.0)';
       start_time := clock_timestamp();
       PERFORM f #= hstore('company', 'new-value'||md5(random()::text)) FROM tmp_f LIMIT test_count;
       RAISE NOTICE 'Test took: % ms (for % rows) Name: %',
          extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
    END;
    $$;

Test results on 9.4.1, win32, i5-4300U

    NOTICE: Test took: 1138 ms (for 200 rows) Name: setfield, Pavel 1: temptable
    NOTICE: Test took: 652 ms (for 5000 rows) Name: setfield2, Pavel 2: reflection
    NOTICE: Test took: 364 ms (for 5000 rows) Name: setfield3, Erwin 1: reflection
    NOTICE: Test took: 275 ms (for 5000 rows) Name: setfield4, Erwin 2: reflection
    NOTICE: Test took: 192 ms (for 5000 rows) Name: setfield5, Pavel 3: json (PG 9.4)
    NOTICE: Test took: 23 ms (for 5000 rows) Name: setfield_json, Geir 1: casting (PG 9.3)
    NOTICE: Test took: 25 ms (for 5000 rows) Name: no function/inlined: json_object (PG 9.4)
    NOTICE: Test took: 14 ms (for 5000 rows) Name: no function/inlined: hstore (PG 9.0)

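Update March 2015:
Largely outdated now. Consider the newer benchmark by @Geir, with faster variants.


Test setup and benchmarks

I took the three solutions presented (as of Oct. 16th, 2011) and ran a test on PostgreSQL 9.0. You can find the complete setup below. Only the test data is not included, as I used a real-life database (not synthetic data). It is all encapsulated in its own schema for non-intrusive use.

I would like to encourage anybody who wants to reproduce the test. Maybe with Postgres 9.1? And add your results here? 🙂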

    -- DROP SCHEMA x CASCADE;
    CREATE SCHEMA x;

    -- Pavel 1
    CREATE OR REPLACE FUNCTION x.setfield(anyelement, text, text)
      RETURNS anyelement
      LANGUAGE plpgsql
    AS $function$
    begin
       create temp table aux as select $1.*;
       execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
       select into $1 * from aux;
       drop table aux;
       return $1;
    end;
    $function$;

    -- Pavel 2 (with patches)
    CREATE OR REPLACE FUNCTION x.setfield2(anyelement, text, text)
      RETURNS anyelement
      LANGUAGE plpgsql
    AS $function$
    DECLARE
       _name   text;
       _values text[];
       _value  text;
       _attnum int;
    BEGIN
       FOR _name, _attnum IN
          SELECT a.attname, a.attnum
          FROM   pg_catalog.pg_attribute a
          WHERE  a.attrelid = (SELECT typrelid FROM pg_type WHERE oid = pg_typeof($1)::oid)
       LOOP
          IF _name = $2 THEN
             _value := $3;
          ELSE
             EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
          END IF;
          _values[_attnum] := COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
       END LOOP;

       EXECUTE 'SELECT (' || pg_typeof($1)::text || '''(' || array_to_string(_values,',') || ')'').*'
       INTO $1;

       RETURN $1;
    END;
    $function$;

    -- Erwin 1
    CREATE OR REPLACE FUNCTION x.setfield3(anyelement, text, text)
      RETURNS anyelement AS
    $body$
    DECLARE
       _list text;
    BEGIN
       _list := (
          SELECT string_agg(x.fld, ',')
          FROM  (
             SELECT CASE WHEN a.attname = $2
                       THEN quote_literal($3)
                       ELSE quote_ident(a.attname)
                    END AS fld
             FROM   pg_catalog.pg_attribute a
             WHERE  a.attrelid = (SELECT typrelid FROM pg_type WHERE oid = pg_typeof($1)::oid)
             ORDER  BY a.attnum
             ) x
          );

       EXECUTE ' SELECT ' || _list || ' FROM (SELECT $1.*) x'
       USING  $1
       INTO   $1;

       RETURN $1;
    END;
    $body$ LANGUAGE plpgsql;

    -- composite type for tests.
    CREATE TYPE x.t_f as (
      id       int
     ,company  text
     ,sort     text
     ,log_up   timestamp
     ,log_upby smallint
    );

    -- temp table with real life test data
    DROP   TABLE IF EXISTS tmp_f;
    CREATE TEMP table tmp_f AS
       SELECT ROW(firma_id,firma,sort,log_up,log_upby)::x.t_f AS f
       FROM   ef.firma
       WHERE  firma !~~ '"%';

    -- SELECT count(*) FROM tmp_f;  -- 5183

    -- Quick test: results are identical?
    SELECT *,
           x.setfield (f, 'company','test')
          ,x.setfield2(f, 'company','test')
          ,x.setfield3(f, 'company','test')
    FROM   tmp_f
    LIMIT  10;

Benchmarks

I ran the queries a couple of times to populate the cache. The results presented are the best of five total runtimes with EXPLAIN ANALYZE.

First round with 1000 rows

Pavel's first prototype maxes out shared memory with more rows.

Pavel 1: 2445.112 ms

    SELECT x.setfield (f, 'company','test') FROM tmp_f limit 1000;

Pavel 2: 263.753 ms

    SELECT x.setfield2(f, 'company','test') FROM tmp_f limit 1000;

Erwin 1: 120.671 ms

    SELECT x.setfield3(f, 'company','test') FROM tmp_f limit 1000;

Another test with 5183 rows

Pavel 2: 1327.429 ms

    SELECT x.setfield2(f, 'company','test') FROM tmp_f;

Erwin 1: 588.691 ms

    SELECT x.setfield3(f, 'company','test') FROM tmp_f;