Joins and Result Trees
While not an O/RM, Massive does share some features with one: the Readable.join() method creates a compound entity representing the origin table or view with additional entities included via JOIN clauses. The compound entity's query and persistence methods may then retrieve and filter by information in the subsidiary relations, and in the case of query methods, the results are decomposed into object trees mirroring the join structure. Decomposition may also be used directly, for example on complex multi-table views.
Readable.join()
As of Massive 6.0.0, instances of Readable have a join method which builds a "compound" entity from a definition document (very similar to a decomposition schema, if you've used those). The compound entity's Readable and, if applicable, Writable methods target multiple relations by building queries with JOIN clauses.
// list East Virginian libraries having one or more of
// Italo Calvino's works in their collection; each
// library further has a list of Calvino's `books`.
const librariesWithBooks = await db.libraries.join({
  books: {
    type: 'INNER',
    on: {library_id: 'id'}
  }
}).find({
  state: 'EV',
  'books.author ILIKE': 'calvino, %'
});
Readable.join itself is a synchronous function. It does not need to be awaited and can be chained immediately into an API method such as find, insert, update, or destroy.
The compound entity returned by Readable.join is cached on the Database object. Future invocations with the same join definition will use the cached entity to avoid redundant processing.
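The caching can be pictured with a small, hypothetical memoization sketch. The names here (compoundCache, getCompoundEntity) and the use of JSON.stringify as a cache key are illustrative assumptions; Massive's actual key derivation and storage are internal details and may differ:

```javascript
// illustrative only: cache compound entities by a serialized form of
// their join definition, so identical definitions reuse one entity
const compoundCache = new Map();

function getCompoundEntity(originName, definition, build) {
  // JSON.stringify is assumed here purely for the sketch
  const key = originName + ':' + JSON.stringify(definition);

  if (!compoundCache.has(key)) {
    compoundCache.set(key, build(originName, definition));
  }

  return compoundCache.get(key);
}

let builds = 0;
const build = (name, def) => { builds += 1; return {name, def}; };

const first = getCompoundEntity('libraries', {books: {}}, build);
const second = getCompoundEntity('libraries', {books: {}}, build);

// the second call hits the cache instead of rebuilding:
// builds === 1, and first and second are the same object
```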
If you're interested in a look behind the curtain, check out this post. For documentation and examples, read on.
Join Definitions
Join definitions take the form of object trees. Every key at the root level names or aliases a relation, and each object value for these keys describes how that relation is to be joined. Definitions can be nested to arbitrary depths, and Massive can fill in a lot of the details automatically, recognizing primary key columns and even following some foreign key relationships. Depending on the circumstances, any or all of the join properties themselves may be optional.
Let's add to the join query above: we want to know who's checked these books out. This is what the full definition would look like, with even the optional properties specified:
{
  books: {
    type: 'INNER',
    relation: 'books',
    pk: 'id',
    on: {library_id: 'id'},
    omit: false,
    patron_books: {
      type: 'LEFT OUTER',
      relation: 'patron_books',
      pk: ['patron_id', 'book_id'],
      on: {book_id: 'books.id'},
      omit: true
    },
    who_checked_out: {
      type: 'LEFT OUTER',
      relation: 'patrons',
      pk: 'id',
      on: {id: 'patron_books.patron_id'},
      omit: false
    }
  }
}
Here, libraries is the origin relation, while books is a joined relation; libraries may also be thought of as books' parent. patron_books is a junction table with a compound primary key; since it would be redundant if included in the output, it has been excluded with the omit option. Finally, who_checked_out is an alias for the patrons table. Patrons will be included in an array at the path books[n].who_checked_out.
In this instance, who_checked_out could have been nested inside patron_books instead of being placed under books. As long as the on is explicitly specified, it doesn't particularly matter which option you choose. However, Massive also analyzes foreign key relationships and can automatically fill in on values if there's a single, unambiguous relationship expressed in the database schema. If you're relying on this, the placement matters: who_checked_out should be nested under patron_books so it can be matched to the other side of the foreign key relationship correctly.
Relations in database schemas off your current search path (ordinarily the public schema) may, but do not usually need to, be aliased: names such as myschema.mytable are valid keys in the join definition. Records from myschema.mytable will appear under the name mytable in the final output. If multiple mytables in multiple schemas are involved, at least one must be aliased to avoid collision.
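The naming rule above can be sketched in a few lines. This is not Massive's code: outputName is an invented helper, and the rule that a relation property marks the key as an alias is taken from the definition-property descriptions in this document:

```javascript
// sketch: resolve a join-definition key to the name its records will
// carry in the output
function outputName(key, spec) {
  // a `relation` property means the key itself is an alias, which
  // takes precedence and becomes the output name
  if (spec && spec.relation) { return key; }

  // otherwise strip any schema qualifier: myschema.mytable -> mytable
  const parts = key.split('.');

  return parts[parts.length - 1];
}

outputName('myschema.mytable', {});                   // 'mytable'
outputName('who_checked_out', {relation: 'patrons'}); // 'who_checked_out'
```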
Field references, whether in the join schema itself or in criteria or order objects passed to methods on a compound entity, must include the alias or name of the relation to which they belong, unless they belong to the origin. Aliases also take precedence over names. If you've aliased a relation in a similar manner to who_checked_out
, reference it by the alias to avoid confusion. The records from an aliased relation appear under that alias in the final output.
Join conditions in on may specify constants and values in JSON fields (on either side) as well:
on: {
  library_id: 'id',
  still_in_collection: true,
  'metadata_json.cover.type': 'hardcover'
}
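As a rough picture of how such an on object might become SQL, here is a hypothetical builder. buildOn and jsonRef are inventions for this sketch, with simplified rules (plain string values name parent columns; other values, and values paired with a JSON path, are constants); Massive's real query builder handles quoting, operators, and parameterization far more thoroughly:

```javascript
// sketch: turn an `on` object into a SQL join condition string
function buildOn(joined, parent, on) {
  return Object.entries(on).map(([left, right]) => {
    const isJson = left.includes('.');
    const lhs = isJson ? jsonRef(joined, left) : `"${joined}"."${left}"`;

    if (typeof right === 'string' && !isJson) {
      // a plain string names a column on the parent relation
      return `${lhs} = "${parent}"."${right}"`;
    }

    // anything else is treated as a constant (quoted if a string)
    const rhs = typeof right === 'string' ? `'${right}'` : right;

    return `${lhs} = ${rhs}`;
  }).join(' AND ');
}

// JSON paths use Postgres' #>> text-extraction operator
function jsonRef(relation, ref) {
  const [field, ...path] = ref.split('.');

  return `"${relation}"."${field}"#>>'{${path.join(',')}}'`;
}
```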
Definition Properties
Key | Type | Description |
---|---|---|
type | String | The JOIN type; defaults to INNER. |
on | Object | Maps fields in the joined relation to fields in the origin or parent relation. |
relation | String | The actual relation name, if using an alias. |
pk | String/Array | The field or fields uniquely identifying one record for a relation without an explicit primary key constraint (e.g. views). |
decomposeTo | String | Joined records are ordinarily decomposed into arrays. Set to object to decompose into a singular object instead. |
omit | Boolean | true to exclude this relation from the output. Useful for junction tables. |
Defaults and Shortcuts
The on property is optional if one and only one foreign key relationship exists between the relation and its parent. Both relations must be tables in order to create such a foreign key.
The same example join definition could thus be expressed more succinctly, assuming all primary and foreign keys are in place:
{
  books: {
    patron_books: {
      type: 'LEFT OUTER',
      omit: true,
      who_checked_out: {
        type: 'LEFT OUTER',
        relation: 'patrons'
      }
    }
  }
}
Note that who_checked_out is nested under patron_books, unlike the previous example, in order to take advantage of foreign key detection.
The bare minimum for a simple inner join along a foreign key relationship is an empty object, accepting all defaults and introspected settings:
const librariesWithBooks = await db.libraries.join({
  books: {}
}).find(...);
Or, for a single joined relation, the name alone suffices:
const librariesWithBooks = await db.libraries
  .join('books')
  .find(...);
Persistence
A compound entity produced by invoking join() on a Writable will naturally possess the Writable suite of persistence methods, including insert, update, save, and destroy. There are some functional limitations to compound Writables: the origin relation is the primary or only target, and only tuples from the origin relation are returned -- there's no tree built from related records.
insert on a compound Writable functions exactly like a deep insert. Match descendants to the join schema and provide an explicit undefined where the primary key of the origin record is to be filled in.
const librariesWithBooks = db.libraries.join('books');

const newLibrary = await librariesWithBooks.insert({
  name: 'Lichfield Public Library',
  state: 'EV',
  books: [{
    library_id: undefined,
    title: 'Jurgen: A Comedy of Justice',
    author: 'Cabell, James Branch'
  }, {
    library_id: undefined,
    title: 'If On a Winter\'s Night a Traveller',
    author: 'Calvino, Italo'
  }]
});
update and destroy modify only the origin relation, but can apply criteria which filter based on joined relations.
const librariesWithBooks = db.libraries.join('books');

const withCabell = await librariesWithBooks.update({
  'books.author ilike': 'cabell, %'
}, {
  has_cabell: true
});

// note that following a foreign key in the opposite
// direction -- deleting libraries based on a patron or
// book, where patrons or books have `library_id`s --
// will fail since the foreign key constraint would be
// violated
const libraryMembers = db.patrons.join('libraries');

const iplPatrons = await libraryMembers.destroy({
  'libraries.name ilike': 'Imaginary Public Library'
});
save continues to function either as insert or update, depending on whether a primary key value is supplied for the origin relation. However, it is less useful since inserts can target multiple tables à la deep insert while updates cannot.
Caveats
- INNER and LEFT OUTER are the only fully supported join types. Fortunately, they're also by far the most commonly used join types. The principal reason for this is that decomposition depends on having non-null tuples for the origin relation in all records, which neither RIGHT OUTER nor FULL OUTER joins guarantee. CROSS JOIN, being rather rare in application contexts, has simply not been implemented.
- findOne is useless with one:many joins because it applies a LIMIT 1 clause to all involved relations, and so will discard supernumerary results.
- update and destroy cannot be used if a joined relation other than the origin has implicit or explicit on conditions which reference the origin table, due to limitations in Postgres itself.
- count is unreliable with one:many joins. Count the origin relation alone, or use SQL for inner joins and filtering by non-origin fields.
- The fields option cannot be used to restrict the resultset of join queries, since it destroys the aliasing used to track columns.
- The limit option is unreliable with one:many joins, having especially dire consequences for offset-limit pagination.
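The count caveat follows from the shape of the flat resultset: each origin record repeats once per joined record. A tiny sketch with made-up rows makes the overcount visible:

```javascript
// hypothetical flat rows from a one:many libraries->books join
const joined = [
  {library_id: 1, book_id: 10},
  {library_id: 1, book_id: 11},
  {library_id: 2, book_id: 12}
];

// a naive count over the join sees three rows...
const naive = joined.length;

// ...but only two distinct libraries are represented
const distinct = new Set(joined.map(r => r.library_id)).size;

// naive === 3, distinct === 2: count the origin relation alone instead
```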
Standalone Resultset Decomposition
You may already have or be designing a view, script, query, or database function which compiles results from multiple relations, and want to transform data from an existing Massive API endpoint in the same manner as compound entities structure output data into trees. This is supported through the decompose
query option, which indicates a schema similar to that passed to Readable.join()
.
Property | Type | Description |
---|---|---|
pk | String/Array | Specify a returned field or fields uniquely identifying a single record for this node in the decomposition schema. |
columns | Array/Object | An array of field names, or an object mapping returned field names (keys) to their final names in the output. |
decomposeTo | String | Nested values are ordinarily decomposed into arrays. Set to object to decompose them into a singular object instead. |
Any other key on a schema is taken to represent a nested schema, and nested schemas may not be named with one of the reserved keys. If a schema contains both a column and a descendant schema with the same name, the descendant schema wins and overwrites the column value in the final output.
The following schema:
const usersWithTests = await db.user_tests.find({}, {
  decompose: {
    pk: 'user_id',
    columns: ['user_id', 'login'],
    tests: {
      pk: 'test_id',
      columns: { test_id: 'id', name: 'name' }
    }
  }
});
will transform this recordset:
[{ user_id: 1, login: 'alice', test_id: 1, name: 'one' },
{ user_id: 1, login: 'alice', test_id: 2, name: 'two' },
{ user_id: 2, login: 'bob', test_id: 3, name: 'three' }]
into this:
[{
user_id: 1,
login: 'alice',
tests: [{ id: 1, name: 'one' },
{ id: 2, name: 'two' }]
}, {
user_id: 2,
login: 'bob',
tests: [{ id: 3, name: 'three' }]
}]
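To make the transformation concrete, here is a simplified, self-contained reimplementation of the decomposition step. It is a sketch only, covering just pk, columns, and nested schemas; Massive's own implementation is more general (handling decomposeTo, among other things):

```javascript
// sketch: decompose a flat recordset into a tree per a schema
function decompose(schema, rows) {
  const {pk, columns, ...children} = schema;
  const byId = new Map();

  for (const row of rows) {
    if (!byId.has(row[pk])) {
      const record = {};

      // columns may be an array of names or a source->output mapping
      const pairs = Array.isArray(columns)
        ? columns.map(c => [c, c])
        : Object.entries(columns);

      for (const [src, out] of pairs) { record[out] = row[src]; }

      byId.set(row[pk], {record, rows: []});
    }

    // retain each flat row for recursion into nested schemas
    byId.get(row[pk]).rows.push(row);
  }

  return [...byId.values()].map(({record, rows}) => {
    for (const [name, child] of Object.entries(children)) {
      record[name] = decompose(child, rows);
    }

    return record;
  });
}
```

Running this with the schema and recordset above produces the same nested output shown in the example.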
This can also be used with raw SQL through db.query. Note that options need to be passed as the third argument, as the params array is still required in the second argument.
const usersWithTests = await db.query(
  `select u.id as u_id, u.name as u_name,
    u.address as u_address,
    t.id as t_id, t.score as t_score
  from users u
  inner join tests t on t.user_id = u.id`,
  [],
  {
    decompose: {
      pk: 'u_id',
      columns: {
        u_id: 'id',
        u_name: 'name',
        u_address: 'address'
      },
      tests: {
        pk: 't_id',
        columns: {t_id: 'id', t_score: 'score'}
      }
    }
  }
);
The decompose option can be applied to any result set, although it will generally be most useful with views and scripts.