Joins and Result Trees

While not an O/RM, Massive has some O/RM-esque capabilities when it comes to working with multiple relations. The Readable.join() method creates a compound entity representing the origin table or view with additional entities included via JOIN clauses. The compound entity's query and persistence methods may then retrieve and filter by information in the subsidiary relations, and in the case of query methods, the results are decomposed into object trees mirroring the join structure. Decomposition may also be used directly, for example on complex multi-table views.

Readable.join ()

As of Massive 6.0.0, instances of Readable have a join method which builds a "compound" entity from a definition document (very similar to a decomposition schema, if you've used those). The compound entity's Readable and, if applicable, Writable methods target multiple relations by building queries with JOIN clauses.

// list East Virginian libraries having one or more of
// Italo Calvino's works in their collection; each
// library further has a list of Calvino's `books`.

const librariesWithBooks = await db.libraries.join({
  books: {
    type: 'INNER',
    on: {library_id: 'id'}
  }
}).find({
  state: 'EV',
  'books.author ILIKE': 'calvino, %'
});

Readable.join itself is a synchronous function. It does not need to be awaited and can be chained immediately into an API method such as find, insert, update, or destroy.

The compound entity returned by Readable.join is cached on the Database object. Future invocations with the same join definition will use the cached entity to avoid redundant processing.

If you're interested in a look behind the curtain, check out this post. For documentation and examples, read on.

Join Definitions

Join definitions take the form of object trees. Every key at the root level names or aliases a relation, and each object value for these keys describes how that relation is to be joined. Definitions can be nested to arbitrary depths, and Massive can fill in a lot of the details automatically, recognizing primary key columns and even following some foreign key relationships. Depending on the circumstances, any or all of the join properties themselves may be optional.

Let's add to the join query above: we want to know who's checked these books out. This is what the full definition would look like, with even the optional properties specified:

{
  books: {
    type: 'INNER',
    relation: 'books',
    pk: 'id',
    on: {library_id: 'id'},
    omit: false,
    patron_books: {
      type: 'LEFT OUTER',
      relation: 'patron_books',
      pk: ['patron_id', 'book_id'],
      on: {book_id: 'books.id'},
      omit: true
    },
    who_checked_out: {
      type: 'LEFT OUTER',
      relation: 'patrons',
      pk: 'id',
      on: {id: 'patron_books.patron_id'},
      omit: false
    }
  }
}

Here, libraries is the origin relation, while books is a joined relation; libraries may also be thought of as books's parent. patron_books is a junction table, with a compound primary key; since it would be redundant if included in the output, it has been excluded with the omit option. Finally, who_checked_out is an alias for the patrons table. Patrons will be included in an array at the path books[n].who_checked_out.

In this instance, who_checked_out could have been nested inside patron_books instead of being placed under books. As long as the on is explicitly specified, it doesn't particularly matter which option you choose. However, Massive also analyzes foreign key relationships and can automatically fill in on values if there's a single, unambiguous relationship expressed in the database schema. If you're relying on this, the placement matters: who_checked_out should be nested under patron_books so it can be matched to the other side of the foreign key relationship correctly.

Relations in database schemas off your current search path (ordinarily the public schema) may but do not usually need to be aliased: names such as myschema.mytable are valid keys in the join definition. Records from myschema.mytable will appear under the name mytable in the final output. If there are multiple mytables in multiple schemas involved, at least one must be aliased to avoid collision.

Field references, whether in the join schema itself or in criteria or order objects passed to methods on a compound entity, must include the alias or name of the relation to which they belong, unless they belong to the origin. Aliases also take precedence over names. If you've aliased a relation in a similar manner to who_checked_out, reference it by the alias to avoid confusion. The records from an aliased relation appear under that alias in the final output.

Join conditions in on may specify constants and values in JSON fields (on either side) as well:

on: {
  library_id: 'id',
  still_in_collection: true,
  'metadata_json.cover.type': 'hardcover'
}

Definition Properties

Key Type Description
type String The JOIN type; defaults to INNER.
on Object Maps fields in the joined relation to fields in the origin or parent relation.
relation String The actual relation name, if using an alias.
pk String/Array The field or fields uniquely identifying one record for a relation without an explicit primary key constraint (e.g. views).
decomposeTo String Joined records are ordinarily decomposed into arrays. Set to object to decompose into a singular object instead.
omit Boolean true to exclude this relation from the output. Useful for junction tables.

Defaults and Shortcuts

The on property is optional if one and only one foreign key relationship exists between the relation and its parent. Both relations must be tables in order to create such a foreign key.

The same example join definition could thus be expressed more succinctly, assuming all primary and foreign keys are in place:

{
  books: {
    patron_books: {
      type: 'LEFT OUTER',
      omit: true
      who_checked_out: {
        type: 'LEFT OUTER',
        relation: 'patrons'
      }
    }
  }
}

Note that who_checked_out is nested under patron_books, unlike the previous example, in order to take advantage of foreign key detection.

The bare minimum for a simple inner join along a foreign key relationship is an empty object, accepting all defaults and introspected settings:

const librariesWithBooks = await db.libraries.join({
  books: {}
}).find(...);

Or, for a single joined relation, the name alone suffices:

const librariesWithBooks = await db.libraries
  .join('books')
  .find(...);

Persistence

A compound entity produced by invoking join() on a Writable will naturally possess the Writable suite of persistence methods including insert, update, save, and destroy. There are some functional limitations to compound Writables: the origin relation is the primary or only target, and only tuples from the origin relation are returned -- there's no tree built from related records.

insert on a compound Writable functions exactly as deep insert. Match descendants to the join schema and provide an explicit undefined where the primary key of the origin record is to be filled in.

const librariesWithBooks = db.libraries.join('books');
const newLibrary = await librariesWithBooks.insert({
  name: 'Lichfield Public Library',
  state: 'EV',
  books: [{
    library_id: undefined,
    title: 'Jurgen: A Comedy of Justice',
    author: 'Cabell, James Branch'
  }, {
    library_id: undefined,
    title: 'If On a Winter\'s Night a Traveller',
    author: 'Calvino, Italo'
  }]
});

update and destroy modify only the origin relation, but can apply criteria which filter based on joined relations.

const librariesWithBooks = db.libraries.join('books');
const withCabell = await librariesWithBooks.update({
  'books.author ilike': 'cabell, %'
}, {
  has_cabell: true
});
// note that following a foreign key in the opposite
// direction -- deleting libraries based on a patron or
// book, where patrons or books have `library_id`s --
// will fail since the foreign key constraint would be
// violated

const libraryMembers = db.patrons.join('libraries');
const iplPatrons = await libraryMembers.destroy({
  'libraries.name ilike': 'Imaginary Public Library'
});

save continues to function either as insert or update, depending on whether a primary key value is supplied for the origin relation. However, it is less useful since inserts can target multiple tables à la deep insert while updates cannot.

Caveats

  • INNER and LEFT OUTER are the only fully supported join types. Fortunately, they're also by far the most commonly used join types. The principal reason for this is that decomposition depends on having non-null tuples for the origin relation in all records, which neither RIGHT OUTER nor FULL OUTER joins guarantee. CROSS JOIN, being rather rare in application contexts, has simply not been implemented.
  • findOne is useless with one:many joins because it applies a LIMIT 1 clause to all involved relations, and so will discard supernumerary results.
  • update and destroy cannot be used if a joined relation other than the origin has implicit or explicit on conditions which reference the origin table, due to limitations in Postgres itself.
  • count is unreliable with one:many joins. Count the origin relation alone, or use SQL for inner joins and filtering by non-origin fields.
  • The fields option cannot be used to restrict the resultset of join queries, since it destroys the aliasing used to track columns.
  • The limit option is unreliable with one:many joins, having especially dire consequences for offset-limit pagination.

Standalone Resultset Decomposition

You may already have or be designing a view, script, query, or database function which compiles results from multiple relations, and want to transform data from an existing Massive API endpoint in the same manner as compound entities structure output data into trees. This is supported through the decompose query option, which indicates a schema similar to that passed to Readable.join().

Property Type Description
pk String/Array Specify a returned field or fields uniquely identifying a single record for this node in the decomposition schema.
columns Array/Object An array of field names, or an object mapping returned field names (keys) to their final names in the output.
decomposeTo String Nested values are ordinarily decomposed into arrays. Set to object to decompose them into a singular object instead.

Any other key on a schema is taken to represent a nested schema, and nested schemas may not be named with one of the reserved keys. If a schema contains both a column and a descendant schema with the same name, the descendant schema wins and overwrites the column value in the final output.

The following schema:

const usersWithTests = await db.user_tests.find({}, {
  decompose: {
    pk: 'user_id',
    columns: ['user_id', 'login'],
    tests: {
      pk: 'test_id',
      columns: { test_id: 'id', name: 'name' }
    }
  }
});

will transform this recordset:

[{ user_id: 1, login: 'alice', test_id: 1, name: 'one' },
 { user_id: 1, login: 'alice', test_id: 2, name: 'two' },
 { user_id: 2, login: 'bob', test_id: 3, name: 'three' }]

into this:

[{
  user_id: 1,
  login: 'alice',
  tests: [{ id: 1, name: 'one' },
          { id: 2, name: 'two' }]
}, {
  user_id: 2,
  login: 'bob',
  tests: [{ id: 3, name: 'three' }]
}]

This can also be used with raw SQL through db.query. Note that options need to be passed as the third argument, as the params array is still required in the second argument.

const usersWithTests = await db.query(
  `select u.id as u_id, u.name as u_name,
    u.address as u_address,
    t.id as t_id, t.score as t_score
    from users u
    inner join tests t on t.user_id = u.id`,
  [],
  {
    decompose: {
      pk: 'id',
      columns: {
        u_id: 'id',
        u_name: 'name',
        u_address: 'address'
      },
      tests: {
        pk: 't_id',
        columns: {t_id: 'id', t_score: 'score'}
      }
    }
  }
);

The decompose option can be applied to any result set, although it will generally be most useful with views and scripts.