Query Fragments: base concepts

This chapter describes the base interfaces for query Fragments and related concepts. It is mostly intended for those wanting to create custom Fragments and / or wishing to understand the inner workings of the package.

Fragments overview

Every query being built by the package goes through the following steps

  • TableGateway / SelectProxy generates a cache key for the query and calls TableLocator::createNativeStatementUsingCache() passing a factory closure and the key.

  • If there is a cached NativeStatement instance with that key, that is returned.

  • Otherwise, a query is created:

    • A factory closure is called, it creates a base query AST (e.g. SELECT self.* from table_name as self) and then applies Fragments to it.

    • The resultant Statement (i.e. AST) is converted to NativeStatement (i.e. SQL with additional types metadata) and possibly cached.

Fragments are responsible for two of the above steps:

  • generating the key (indirectly) and

  • modifying the AST (directly).

They should serve as a sort of proxy for a part of Statement AST, creating the actual Node objects only when the applyTo() method is called. E.g. if an instance of Fragment contains some manually written SQL as a string, that string should not be processed by Parser unless applyTo() is called. The string should be used for generating a unique key for the Fragment, though.

Fragments should generally be independent and reusable. It is possible to set up a dependency between Fragments via priority, but without explicit priority it should be assumed that the fragments can be applied in any order and to any Statement. It is a job of a Fragment to check whether it can be applied in the first place.

KeyEquatable interface

namespace sad_spirit\pg_gateway;

interface KeyEquatable
{
    public function getKey(): ?string;
}

getKey() method of an implementing object should return a string that uniquely identifies this object based on its properties. Returning null means that the object (and consequently the query using it) cannot be cached.

Implementations of this interface are considered “equal” when building a query if their keys are equal. This is used for two main purposes:

  • We need a means to generate cache key for a query without generating SQL itself. Cache key for a complete statement will be generated based on values returned by getKey() methods of its Fragments.

  • FragmentList discards duplicate fragments (= having equal string keys): those may appear when several Fragments add the same Fragment (e.g. a CTE or a join to a related table) as a dependency.

An implementation should only return a non-null key if

  • It is immutable, receiving all its dependencies in the constructor;

  • It will always apply the same changes to the same given Statement.

The key should depend on SQL being generated but never on values of parameters used in the query, even if those are passed with the Fragment.

If an implementation of KeyEquatable has a property that also implements KeyEquatable then it should

  • Return null from getKey() to prevent caching if the property’s getKey() returns null.

  • Generate a key depending on the child’s string key otherwise.

E.g. for WhereClauseFragment containing a Condition:

public function getKey(): ?string
{
    $conditionKey = $this->condition->getKey();
    return null === $conditionKey ? null : 'where.' . $conditionKey;
}

Fragment interface

namespace sad_spirit\pg_gateway;

use sad_spirit\pg_builder\Statement;

interface Fragment extends KeyEquatable
{
    public function applyTo(Statement $statement) : void;
    public function getPriority() : int;
}

Implementations of Fragment are classes that directly modify the Statement being built. They, however, are not necessarily generating the actual changes, delegating this instead to some other classes. E.g. WhereClauseFragment uses an expression generated by Condition.

In any case, as stated above, Fragment implementations should delay building parts of AST until applyTo() is actually called.

getPriority() returns the fragment’s priority. Fragments with higher priority will be applied earlier, this may be relevant for CTEs, joins, and parts of ORDER BY / GROUP BY clauses. If fragments have the same priority then they will be applied in alphabetical order of their keys.

Fragment has several constants defined for priority values, but it is not necessary to use these, any int value will do.

SelectFragment interface

This is an interface for fragments specific to SELECT statements. As SelectProxy can actually execute two different queries using the same set of fragments:

  • SELECT * query executed in getIterator() and

  • SELECT COUNT(*) query executed in executeCount()

we need a means to filter parts of the query that are not relevant to the latter.

namespace sad_spirit\pg_gateway;

use sad_spirit\pg_builder\Statement;

interface SelectFragment extends Fragment
{
    public function applyTo(Statement $statement, bool $isCount = false): void;
    public function isUsedForCount(): bool;
}

isUsedForCount() returns whether this fragment should be added to a SELECT COUNT(*) query at all. If the fragment does not change the number of returned rows or if it doesn’t make sense for SELECT COUNT(*) query (e.g. ORDER, LIMIT, OFFSET), then it should be skipped.

The $isCount parameter for applyTo() specifies whether a SELECT COUNT(*) query is being processed. It is intended for the JOIN-type fragments: while the join itself may be needed as it affects the number of returned rows, adding fields from the joined table to the target list should be omitted.

FragmentBuilder interface

namespace sad_spirit\pg_gateway;

interface FragmentBuilder
{
    public function getFragment(): Fragment;
}

The getFragment() method of an implementing class should return the built fragment.

This interface has two main purposes

  • As Fragment instances should be immutable, they may need a lot of complex constructor arguments. It is much easier to use a builder with a fluent interface than to create these manually.

  • Fragment dependencies that are not instances of Fragment can implement FragmentBuilder to be accepted by gateway’s query methods.

The first purpose is easily illustrated with JoinBuilder:

use sad_spirit\pg_gateway\builders\JoinBuilder;
use sad_spirit\pg_gateway\metadata\TableName;

// join() returns a subclass of JoinBuilder that will also proxy methods of $documentsBuilder
$documentsBuilder->join(new TableName('employees'))
   ->onForeignKey(['author_id'])
   ->left()
   ->alias('author')
   ->useForCount(false);

The second one is most obvious with Condition that wraps itself in WhereClauseFragment in its getFragment() implementation:

public function getFragment(): Fragment
{
    return new fragments\WhereClauseFragment($this);
}

which allows passing Condition instances directly to gateway’s query methods, adding them to the WHERE clause of the query being built.

Passing parameter values with Fragments

If a Fragment adds a parameter placeholder :param to the query, it may make sense to pass a value for that parameter alongside the fragment. It is essentially required for Fragments that embed a SelectProxy into the larger statement, as SelectProxy implementations should contain all parameters needed to execute a query.

Parametrized interface

namespace sad_spirit\pg_gateway;

interface Parametrized
{
    public function getParameterHolder(): ParameterHolder;
}

The getParameterHolder() method of the implementing class should return values for named parameters.

We do not return just an associative array, as we want to perform an additional check when combining parameter values from several sources: there should not be several values for the same parameter name. This check is performed by an implementation of ParameterHolder interface.

Most of the built-in Fragment implementations do actually implement Parametrized.

ParameterHolder interface

namespace sad_spirit\pg_gateway;

interface ParameterHolder
{
    public function getOwner() : KeyEquatable;
    public function getParameters() : array<string, mixed>;
}
getParameters()

Returns an associative array of parameter values.

getOwner()

Returns the Fragment/Condition (or other implementation of KeyEquatable) that is the source of the parameters.

This is only used for generating an exception message if different values for the same parameter were found when combining several ParameterHolder fragments

ParameterHolder has three implementations:

holders\EmptyParameterHolder

This is a Null Object implementation, its getParameters() method always returns [].

holders\SimpleParameterHolder

A wrapper for an associative array, returned when parameter values come from a single source.

holders\RecursiveParameterHolder

Aggregates several child ParameterHolders, returned by Fragments that have several Parametrized children. This is the class that actually performs the check for duplicate values described above.

holders\ParameterHolderFactory class

namespace sad_spirit\pg_gateway\holders;

use sad_spirit\pg_gateway\{
    KeyEquatable,
    ParameterHolder
};

class ParameterHolderFactory
{
    public static function create(?KeyEquatable ...$maybeParametrized) : ParameterHolder;
}

This helper class is used when an object has several properties that may or may not be Parametrized (or may be even null). Given a list of such properties, create() method returns a proper implementation of ParameterHolder.

Consider fragments\JoinFragment::getParameterHolder():

public function getParameterHolder(): ParameterHolder
{
    return ParameterHolderFactory::create($this->joined, $this->condition);
}

$joined may be an instance of SqlStringSelectBuilder which does not implement Parametrized and $condition may be missing altogether. This method may legitimately return any implementation of ParameterHolder.

Using table aliases

As a rule of thumb, all tables that appear in the queries should be aliased. This allows using generated queries in join-type Fragments without possible ambiguities and allows using the same fragment with different gateways.

There are two specially handled aliases

  • self (TableGateway::ALIAS_SELF) - alias for the table handled by the current gateway. All the fragments passed to its query methods should use this alias for access to the table columns.

  • joined (TableGateway::ALIAS_JOINED) - this is a special alias used in the join conditions, it references the table being joined (while self alias references the base table as usual).

As seen throughout the docs, above aliases should be used even in custom SQL fragments:

$builder->sqlCondition(
    "current_date between coalesce(self.valid_from, 'yesterday') and coalesce(self.valid_to, 'tomorrow')"
);

Join-type fragments usually allow specifying an explicit alias for the table being joined. If not given, an automatically generated one will be used.

walkers\ReplaceTableAliasWalker class

This class is used internally by JoinFragment and similar classes to replace the above two aliases by some custom ones:

$select->dispatch(new ReplaceTableAliasWalker(TableGateway::ALIAS_SELF, $alias));

$condition->dispatch(new ReplaceTableAliasWalker(TableGateway::ALIAS_JOINED, $alias));

As it is working with query AST, it will replace aliases even in fragments that were originally added as SQL strings.