Posted by on October 18, 2017

I’ve written a SQL code generator which automates the mindless, repetitive task of joining a table to the tables to which it has foreign keys.

Generating this code isn’t exactly trivial, so I’ve decomposed the problem. I start with the standard information_schema foreign key column query:

We’re going to want to avoid duplicating the join columns in the select list, so the next step is to identify the non-join columns:

Next we group columns so that there is one row per foreign key, with the join columns structured as a join clause and the non-join columns listed.

This query is more complex than I like, but it is the best that I can do.

  1. There is a recursive Common Table Expression to find all of the remote ancestors of a table.
  2. The foreign key columns are aggregated into a SQL string after the CTE.
  3. The join is left outer so that the database can optimise the query and prune tables whose columns aren’t referenced.
  4. The tables are aliased in case the same table is referenced more than once. You can remove the table
    aliases if that doesn’t occur in your database.
  5. The column list is sub-select in the select clause so that it doesn’t modify the number of rows returned.
  6. The columns are aliased in case their names aren’t unique, or if the same table is referenced more than once. You can remove the column
    aliases if neither of those apply
  7. If there is a cycle of foreign keys in your database, where A references B which then references A, then the this view might not be what you expect.

With the joins identified creating the views is just another aggregation:

I call these views, ancestor views. The _av suffix should be adjusted to taste. Having a special schema with an _av suffix also works well, if you’re not already overloaded with schemas for things like concurrent releases.

Once we have generated these views, we can write SQL at a higher level of abstraction. Our code can be concise and limited to our intent, rather than being cluttered by the joins. Queries like the following can be written without being cluttered by dozens of lines of code joining many tables.

Give this a try. I hope you like it.

PS – all of the example code is for postgreSQL 9.6. Converting it to other databases that support information_schema (e.g. MS SQL Server) is simple, mostly a matter of converting the string_agg to whatever your database supports. For non-information schema databases (e.g. Oracle) you’ll need to use the proprietary data dictionary (e.g. all_cons_columns ).

Posted in: Technical


Be the first to comment.

Leave a Reply