I’ve written a SQL code generator which automates the mindless, repetitive task of joining a table to the tables to which it has foreign keys.
Generating this code isn’t exactly trivial, so I’ve decomposed the problem. I start with the standard information_schema foreign key column query:
create or replace view generator.foreign_key_columns as
select
fk.constraint_name as fk_constraint_name
,fk.constraint_schema
,fk.table_name as fk_table_name
,fk.column_name as fk_column_name
,fk.ordinal_position as fk_ordinal_position
,pk.constraint_name as referenced_constraint_name
,pk.table_name as referenced_table_name
,pk.constraint_schema as referenced_schema_name
,pk.column_name as referenced_column_name
,pk.ordinal_position as referenced_ordinal_position
from information_schema.referential_constraints as rc
inner join information_schema.key_column_usage as fk
on fk.constraint_catalog = rc.constraint_catalog
and fk.constraint_schema = rc.constraint_schema
and fk.constraint_name = rc.constraint_name
inner join information_schema.key_column_usage as pk
on pk.constraint_catalog = rc.unique_constraint_catalog
and pk.constraint_schema = rc.unique_constraint_schema
and pk.constraint_name = rc.unique_constraint_name
and pk.ordinal_position = fk.ordinal_position;
We’re going to want to avoid duplicating the join columns in the select list, so the next step is to identify the non-join columns:
create or replace view generator.non_foreign_key_columns as
select table_schema
,table_name
,column_name
from information_schema.columns
where (table_schema, table_name, column_name) not in (
select constraint_schema
, fk_table_name
, fk_column_name
from foreign_key_columns
);
Next we group columns so that there is one row per foreign key, with the join columns structured as a join clause and the non-join columns listed.
create or replace view generator.foreign_key_joins as
with recursive join_columns as (
select
constraint_schema as root_schema
, fk_table_name as root_table_name
, fk_table_name as fk_table_alias
, constraint_schema
, referenced_table_name
, referenced_schema_name
, fk_constraint_name
, referenced_column_name
, fk_column_name
, 0 as level
from generator.foreign_key_columns fks
union all
select
join_columns.root_schema
, join_columns.root_table_name
, join_columns.fk_constraint_name as fk_table_alias
, fks.constraint_schema
, fks.referenced_table_name
, fks.referenced_schema_name
, fks.fk_constraint_name
, fks.referenced_column_name
, fks.fk_column_name
, level + 1 as level
from join_columns
inner join generator.foreign_key_columns fks
on join_columns.referenced_table_name = fks.fk_table_name
and join_columns.referenced_schema_name = fks.constraint_schema
)
select root_schema
, root_table_name
, fk_table_alias
, referenced_table_name
, referenced_schema_name
, fk_constraint_name
, 'left outer join "'|| referenced_schema_name ||'"."'|| referenced_table_name ||'" as "'|| fk_constraint_name ||'"
on '|| string_agg('"'||fk_table_alias ||'"."'||fk_column_name ||'" = "'|| fk_constraint_name ||'"."'||referenced_column_name ||'"','and') as join_sql
/* get other columns as a sub-select so they don't affect the cardinality of the view */
,(
select string_agg('"'||fk_constraint_name||'"."'||non_foreign_key_columns.column_name||'" as "'||fk_constraint_name||'_'||non_foreign_key_columns.column_name||'"',chr(10)||',')
from generator.non_foreign_key_columns
where non_foreign_key_columns.table_schema = referenced_schema_name
and non_foreign_key_columns.table_name = referenced_table_name
) as non_fk_columns
from join_columns
group by root_schema
, root_table_name
, fk_table_alias
, referenced_table_name
, referenced_schema_name
, fk_constraint_name;
This query is more complex than I like, but it is the best that I can do.
With the joins identified creating the views is just another aggregation:
create or replace view generator.ancestor_views as
select root_schema
root_table_name
,'create or replace view "'||root_schema ||'"."'||root_table_name ||'_av" as
select ' || (
select string_agg('"'||root_table_name||'"."'||root_columns.column_name||'"',chr(10)||',')
from generator.non_foreign_key_columns root_columns
where root_columns.table_schema = root_schema
and root_columns.table_name = root_table_name
)
||','|| string_agg(non_fk_columns,chr(10)||',') || '
from "'||root_schema ||'"."'||root_table_name ||'"
'|| string_agg(join_sql,chr(10)) as view_sql
from generator.foreign_key_joins
group by root_schema
, root_table_name;
I call these views, ancestor views. The _av suffix should be adjusted to taste. Having a special schema with an _av suffix also works well, if you’re not already overloaded with schemas for things like concurrent releases.
Once we have generated these views, we can write SQL at a higher level of abstraction. Our code can be concise and limited to our intent, rather than being cluttered by the joins. Queries like the following can be written without being cluttered by dozens of lines of code joining many tables.
SELECT customer_name, email_address FROM order_lines_av WHERE product_recall_flag = true
Give this a try. I hope you like it.
PS – all of the example code is for postgreSQL 9.6. Converting it to other databases that support information_schema (e.g. MS SQL Server) is simple, mostly a matter of converting the string_agg to whatever your database supports. For non-information schema databases (e.g. Oracle) you’ll need to use the proprietary data dictionary (e.g. all_cons_columns ).
Be the first to comment.