How to search all tables (PostgreSQL) for a specific value?
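A similar question can be found here, but it is for Oracle. I have the same question for PostgreSQL.
In short, is it possible to search every field of every table in PostgreSQL for a specific value?
Thanks.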
How about dumping the contents of the database, then using grep?
$ pg_dump --data-only --inserts -U postgres your-db-name > a.tmp
$ grep United a.tmp
INSERT INTO countries VALUES ('US', 'United States');
INSERT INTO countries VALUES ('GB', 'United Kingdom');
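The same utility, pg_dump, can include column names in the output. Just change --inserts to --column-inserts. That way you can search for specific column names, too. But if I were looking for column names, I would probably dump the schema instead of the data.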
$ pg_dump --data-only --column-inserts -U postgres your-db-name > a.tmp
$ grep country_code a.tmp
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('US', 'United States');
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('GB', 'United Kingdom');
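When the dump is large, the raw grep output can be hard to scan. A minimal sketch of one way to condense it into a per-table hit count follows. The sample file stands in for a.tmp, and its airlines table and rows are made up for illustration. It assumes every row sits on a single INSERT line, which will not hold for values that contain newlines.

```shell
#!/bin/sh
# Sample standing in for a.tmp; a real run would read the pg_dump output instead.
# The airlines table is hypothetical and exists only to show a second table.
dump_file=$(mktemp)
cat > "$dump_file" <<'EOF'
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('US', 'United States');
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('GB', 'United Kingdom');
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('FR', 'France');
INSERT INTO airlines (name) VALUES ('United Airlines');
EOF

# Keep the matching lines, take the third word of each INSERT (the table name),
# and count how many matching rows each table contributed.
summary=$(grep -F 'United' "$dump_file" \
  | awk '$1 == "INSERT" && $2 == "INTO" { hits[$3]++ } END { for (t in hits) print t, hits[t] }' \
  | sort)

printf '%s\n' "$summary"
rm -f "$dump_file"
```

The third word may be schema-qualified (for example public.countries), depending on how the dump was produced; the count works the same way in either case.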
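Here is a pl/pgsql function that locates records where any column contains a specific value. It takes as arguments the value to search for in text format, an array of table names to search in (defaults to all tables) and an array of schema names (defaults to all schemas).
It returns a table structure with the schema name, table name, column name and the pseudo-column ctid (the non-durable physical location of the row in the table, see System Columns).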
CREATE OR REPLACE FUNCTION search_columns(
    needle text,
    haystack_tables name[] default '{}',
    haystack_schema name[] default '{}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
  -- Visit every column of every base table that passes the table and schema filters.
  FOR schemaname,tablename,columnname IN
      SELECT c.table_schema,c.table_name,c.column_name
      FROM information_schema.columns c
        JOIN information_schema.tables t ON
          (t.table_name=c.table_name AND t.table_schema=c.table_schema)
      WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
        AND (c.table_schema=ANY(haystack_schema) OR haystack_schema='{}')
        AND t.table_type='BASE TABLE'
  LOOP
    -- Compare the column, cast to text, with the needle.
    -- INTO keeps only the first matching ctid for each column.
    EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L',
       schemaname,
       tablename,
       columnname,
       needle
    ) INTO rowctid;
    IF rowctid is not null THEN
      RETURN NEXT;
    END IF;
  END LOOP;
END;
$$ language plpgsql;
Edit: this code is for PG 9.1 or newer.
Example of use in a test database:
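Search in all tables (with the default arguments declared above, no table or schema filter is applied; the matches below all happen to be in the public schema):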
select * from search_columns('foobar');
 schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
 public     | s3        | usename    | (0,11)
 public     | s2        | relname    | (7,29)
 public     | w         | body       | (0,2)
(3 rows)
Search in a specific table:
select * from search_columns('foobar','{w}');
 schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
 public     | w         | body       | (0,2)
(1 row)
Search in a subset of tables obtained from a select:
select * from search_columns('foobar', array(select table_name::name from information_schema.tables where table_name like 's%'), array['public']);
 schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
 public     | s2        | relname    | (7,29)
 public     | s3        | usename    | (0,11)
(2 rows)
Get a result row from the corresponding base table using its ctid:
select * from public.w where ctid='(0,2)';
 title |  body  |         tsv
-------+--------+---------------------
 toto  | foobar | 'foobar':2 'toto':1
To test against a regular expression instead of strict equality, as grep does, this:
SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L
may be changed to:
SELECT ctid FROM %I.%I WHERE cast(%I as text) ~ %L
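The same literal-versus-pattern choice exists on the dump-and-grep route. A small sketch, using a made-up hosts table as sample data, shows how the needle is read differently by grep -F (literal string) and grep -E (extended regular expression). Note that grep -F matches a literal substring anywhere on the line, so it is looser than the whole-value equality used by the function.

```shell
#!/bin/sh
# Hypothetical sample dump; the hosts table and its values are illustrative only.
dump_file=$(mktemp)
cat > "$dump_file" <<'EOF'
INSERT INTO hosts VALUES ('db.example.com');
INSERT INTO hosts VALUES ('dbXexample.com');
INSERT INTO hosts VALUES ('web.example.com');
EOF

# -F treats the needle as a literal string, so "." matches only a dot.
literal_hits=$(grep -c -F 'db.example' "$dump_file")

# -E treats the needle as a regular expression, so "." matches any character.
regex_hits=$(grep -c -E 'db.example' "$dump_file")

# Alternation with escaped dots can only be expressed as a regular expression.
anchored_hits=$(grep -c -E "'(db|web)\.example\.com'" "$dump_file")

echo "literal=$literal_hits regex=$regex_hits anchored=$anchored_hits"
rm -f "$dump_file"
```

The practical consequence is the same in both tools: a needle that contains characters such as "." or "(" needs escaping once it is interpreted as a pattern.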
The only tool I know of that can do this is SQL Workbench/J: http://www.sql-workbench.net/
It is a Java/JDBC-based tool that offers a special (proprietary) SQL "command" to search through all (or just selected) tables in a database:
http://www.sql-workbench.net/manual/wb-commands.html#command-search-data
http://www.sql-workbench.net/wbgrepdata_png.html
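In case anyone finds it helpful, here is @DanielVérité's function with an additional parameter that accepts the names of the columns to search. This reduces the processing time; at least in my tests, it reduced it a lot.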
CREATE OR REPLACE FUNCTION search_columns(
    needle text,
    haystack_columns name[] default '{}',
    haystack_tables name[] default '{}',
    haystack_schema name[] default '{public}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
  -- Same loop as the original function, with an extra filter on column names.
  FOR schemaname,tablename,columnname IN
      SELECT c.table_schema,c.table_name,c.column_name
      FROM information_schema.columns c
        JOIN information_schema.tables t ON
          (t.table_name=c.table_name AND t.table_schema=c.table_schema)
      WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
        AND c.table_schema=ANY(haystack_schema)
        AND (c.column_name=ANY(haystack_columns) OR haystack_columns='{}')
        AND t.table_type='BASE TABLE'
  LOOP
    EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L',
       schemaname,
       tablename,
       columnname,
       needle
    ) INTO rowctid;
    IF rowctid is not null THEN
      RETURN NEXT;
    END IF;
  END LOOP;
END;
$$ language plpgsql;
Below is an example of using the search function created above.
SELECT * FROM search_columns('86192700'
    , array(SELECT DISTINCT a.column_name::name
            FROM information_schema.columns AS a
              INNER JOIN information_schema.tables as b
                ON (b.table_catalog = a.table_catalog AND b.table_schema = a.table_schema AND b.table_name = a.table_name)
            WHERE a.column_name iLIKE '%cep%'
              AND b.table_type = 'BASE TABLE'
              AND b.table_schema = 'public'
    )
    , array(SELECT b.table_name::name
            FROM information_schema.columns AS a
              INNER JOIN information_schema.tables as b
                ON (b.table_catalog = a.table_catalog AND b.table_schema = a.table_schema AND b.table_name = a.table_name)
            WHERE a.column_name iLIKE '%cep%'
              AND b.table_type = 'BASE TABLE'
              AND b.table_schema = 'public')
);
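Here is @DanielVérité's function with progress reporting functionality. It reports progress in three ways:
- by RAISE NOTICE;
- by decreasing the value of the supplied {progress_seq} sequence from {total number of columns to search in} down to 0 (the function advances the sequence with nextval, so this assumes the sequence was created with a negative increment);
- by writing the progress, along with the tables found so far, into a text file located at c:\windows\temp\{progress_seq}.txt.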
CREATE OR REPLACE FUNCTION search_columns(
    needle text,
    haystack_tables name[] default '{}',
    haystack_schema name[] default '{public}',
    progress_seq text default NULL
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
DECLARE
  currenttable text;
  columnscount integer;
  foundintables text[];
  foundincolumns text[];
begin
  currenttable='';
  -- Count the columns to visit and store that total in the progress sequence.
  columnscount = (SELECT count(1)
      FROM information_schema.columns c
        JOIN information_schema.tables t ON
          (t.table_name=c.table_name AND t.table_schema=c.table_schema)
      WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
        AND c.table_schema=ANY(haystack_schema)
        AND t.table_type='BASE TABLE')::integer;
  PERFORM setval(progress_seq::regclass, columnscount);

  FOR schemaname,tablename,columnname IN
      SELECT c.table_schema,c.table_name,c.column_name
      FROM information_schema.columns c
        JOIN information_schema.tables t ON
          (t.table_name=c.table_name AND t.table_schema=c.table_schema)
      WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
        AND c.table_schema=ANY(haystack_schema)
        AND t.table_type='BASE TABLE'
  LOOP
    EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L',
       schemaname,
       tablename,
       columnname,
       needle
    ) INTO rowctid;
    IF rowctid is not null THEN
      RETURN NEXT;
      -- Remember where matches were found for the progress file.
      foundintables = foundintables || tablename;
      foundincolumns = foundincolumns || columnname;
      RAISE NOTICE 'FOUND! %, %, %, %', schemaname,tablename,columnname, rowctid;
    END IF;
    IF (progress_seq IS NOT NULL) THEN
      PERFORM nextval(progress_seq::regclass);
    END IF;
    -- Report once per table, when the loop moves on to a new one.
    IF(currenttable<>tablename) THEN
      currenttable=tablename;
      IF (progress_seq IS NOT NULL) THEN
        RAISE NOTICE 'Columns left to look in: %; looking in table: %', currval(progress_seq::regclass), tablename;
        EXECUTE 'COPY (SELECT unnest(string_to_array(''Current table (column '
          || columnscount-currval(progress_seq::regclass)
          || ' of ' || columnscount || '): ' || tablename
          || '\n\nFound in tables/columns:\n'
          || COALESCE(
               (SELECT string_agg(c1 || '/' || c2, '\n')
                FROM (SELECT unnest(foundintables) AS c1,unnest(foundincolumns) AS c2) AS t1)
               , '')
          || ''',''\n''))) TO ''c:\WINDOWS\temp\' || progress_seq || '.txt''';
      END IF;
    END IF;
  END LOOP;
END;
$$ language plpgsql;
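Without storing a new procedure, you can use a code block and execute it to obtain a table of occurrences. You can filter the results by schema, table or column name.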
DO $$
DECLARE
  value int := 0;
  sql text := 'The constructed select statement';
  rec1 record;
  rec2 record;
BEGIN
  -- Temporary table that collects every occurrence found.
  DROP TABLE IF EXISTS _x;
  CREATE TEMPORARY TABLE _x (
    schema_name text,
    table_name text,
    column_name text,
    found text
  );
  -- Loop over all text-like columns outside the system schemas.
  -- The LIKE UPPER('%%') pattern matches every column name; narrow it to filter by column.
  FOR rec1 IN
      SELECT table_schema, table_name, column_name
      FROM information_schema.columns
      WHERE table_name <> '_x'
        AND UPPER(column_name) LIKE UPPER('%%')
        AND table_schema <> 'pg_catalog'
        AND table_schema <> 'information_schema'
        AND data_type IN ('character varying', 'text', 'character', 'char', 'varchar')
  LOOP
    sql := concat('SELECT ', rec1."column_name", ' AS "found" FROM ', rec1."table_schema", '.', rec1."table_name",
                  ' WHERE UPPER(', rec1."column_name", ') LIKE UPPER(''', '%my_substring_to_find_goes_here%', ''')');
    RAISE NOTICE '%', sql;
    BEGIN
      FOR rec2 IN EXECUTE sql LOOP
        RAISE NOTICE '%', sql;
        INSERT INTO _x VALUES (rec1."table_schema", rec1."table_name", rec1."column_name", rec2."found");
      END LOOP;
    EXCEPTION WHEN OTHERS THEN
      -- Errors from individual columns are silently ignored.
    END;
  END LOOP;
END; $$;

SELECT * FROM _x;