Query group traits
Metadata
- Author: nikomatsakis
- Date: 2019-01-15
- Introduced in: https://github.com/salsa-rs/salsa-rfcs/pull/1
Motivation
- Support
dyn QueryGroup
for each query group trait as well asimpl QueryGroup
dyn QueryGroup
will be much more convenient, at the cost of runtime efficiency
- Don't require you to redeclare each query in the final database, just the query groups
User's guide
Declaring a query group
User's will declare query groups by decorating a trait with salsa::query_group
:
#[salsa::query_group(MyGroupStorage)]
trait MyGroup {
// Inputs are annotated with `#[salsa::input]`. For inputs, the final trait will include
// a `set_my_input(&mut self, key: K1, value: V1)` method automatically added,
// as well as possibly other mutation methods.
#[salsa::input]
fn my_input(&self, key: K1) -> V1;
// "Derived" queries are just a getter.
fn my_query(&self, key: K2) -> V2;
}
The query_group
attribute is a procedural macro. It takes as
argument the name of the storage struct for the query group --
this is a struct, generated by the macro, which represents the query
group as a whole. It is attached to a trait definition which defines the
individual queries in the query group.
The macro generates three things that users interact with:
- the trait, here named
MyGroup
. This will be used when writing the definitions for the queries and other code that invokes them. - the storage struct, here named
MyGroupStorage
. This will be used later when constructing the final database. - query structs, named after each query but converted to camel-case
and with the word query (e.g.,
MyInputQuery
formy_input
). These types are rarely needed, but are presently useful for things like invoking the GC. These types violate our rule that "things the user needs to name should be given names by the user", but we choose not to fully resolve this question in this RFC.
In addition, the macro generates a number of structs that users should not have to be aware of. These are described in the "reference guide" section.
Controlling query modes
Input queries, as described in the trait, are specified via the
#[salsa::input]
attribute.
Derived queries can be customized by the following attributes,
attached to the getter method (e.g., fn my_query(..)
):
#[salsa::invoke(foo::bar)]
specifies the path to the function to invoke when the query is called (default ismy_query
).#[salsa::volatile]
specifies a "volatile" query, which is assumed to read untracked input and hence must be re-executed on every revision.#[salsa::dependencies]
specifies a "dependencies-only" query, which is assumed to read untracked input and hence must be re-executed on every revision.
Creating the database
Creating a salsa database works by using a #[salsa::database(..)]
attribute. The ..
content should be a list of paths leading to the
storage structs for each query group that the database will
implement. It is no longer necessary to list the individual
queries. In addition to the salsa::database
query, the struct must
have access to a salsa::Runtime
and implement the salsa::Database
trait. Hence the complete declaration looks roughly like so:
#[salsa::database(MyGroupStorage)]
struct MyDatabase {
runtime: salsa::Runtime<MyDatabase>,
}
impl salsa::Database for MyDatabase {
fn salsa_runtime(&self) -> salsa::Runtime<MyDatabase> {
&self.runtime
}
}
This (procedural) macro generates various impls and types that cause
MyDatabase
to implement all the traits for the query groups it
supports, and which customize the storage in the runtime to have all
the data needed. Users should not have to interact with these details,
and they are written out in the reference guide section.
Reference guide
The goal here is not to give the full details of how to do the
lowering, but to describe the key concepts. Throughout the text, we
will refer to names (e.g., MyGroup
or MyGroupStorage
) that appear
in the example from the User's Guide -- this indicates that we use
whatever name the user provided.
The plumbing::QueryGroup
trait
The QueryGroup
trait is a new trait added to the plumbing module. It
is implemented by the query group storage struct MyGroupStorage
. Its
role is to link from that struct to the various bits of data that the
salsa runtime needs:
pub trait QueryGroup<DB: Database> {
type GroupStorage;
type GroupKey;
}
This trait is implemented by the storage struct (MyGroupStorage
)
in our example. You can see there is a bit of confusing nameing going
on here -- what we call (for user's) the "storage struct" actually
does not wind up containing the true storage (that is, the hasmaps
and things salsa uses). Instead, it merely implements the QueryGroup
trait, which has associated types that lead us to structs we need:
- the group storage contains the hashmaps and things for all the queries in the group
- the group key is an enum with variants for each of the queries. It basically stores all the data needed to identify some particular query value from within the group -- that is, the name of the query, plus the keys used to invoke it.
As described further on, the #[salsa::query_group]
macro is
responsible will generate an impl of this trait for the
MyGroupStorage
struct, along with the group storage and group key
type definitions.
The plumbing::HasQueryGroup<G>
trait
The HasQueryGroup<G>
struct a new trait added to the plumbing
module. It is implemented by the database struct MyDatabase
for
every query group that MyDatabase
supports. Its role is to offer
methods that move back and forth between the context of the full
database to the context of an individual query group:
pub trait HasQueryGroup<G>: Database
where
G: QueryGroup<Self>,
{
/// Access the group storage struct from the database.
fn group_storage(db: &Self) -> &G::GroupStorage;
/// "Upcast" a group key into a database key.
fn database_key(group_key: G::GroupKey) -> Self::DatabaseKey;
}
Here the "database key" is an enum that contains variants for each group. Its role is to take group key and puts it into the context of the entire database.
The Query
trait
The query trait (pre-existing) is extended to include links to its group, and methods to convert from the group storage to the query storage, plus methods to convert from a query key up to the group key:
pub trait Query<DB: Database>: Debug + Default + Sized + 'static {
/// Type that you you give as a parameter -- for queries with zero
/// or more than one input, this will be a tuple.
type Key: Clone + Debug + Hash + Eq;
/// What value does the query return?
type Value: Clone + Debug;
/// Internal struct storing the values for the query.
type Storage: plumbing::QueryStorageOps<DB, Self> + Send + Sync;
/// Associate query group struct.
type Group: plumbing::QueryGroup<
DB,
GroupStorage = Self::GroupStorage,
GroupKey = Self::GroupKey,
>;
/// Generated struct that contains storage for all queries in a group.
type GroupStorage;
/// Type that identifies a particular query within the group + its key.
type GroupKey;
/// Extact storage for this query from the storage for its group.
fn query_storage(group_storage: &Self::GroupStorage) -> &Self::Storage;
/// Create group key for this query.
fn group_key(key: Self::Key) -> Self::GroupKey;
}
Converting to/from the context of the full database generically
Putting all the previous plumbing traits together, this means that given:
- a database
DB
that implementsHasGroupStorage<G>
; - a group struct
G
that implementsQueryGroup<DB>
; and, - and a query struct
Q
that implementsQuery<DB, Group = G>
we can (generically) get the storage for the individual query
Q
out from the database db
via a two-step process:
let group_storage = HasGroupStorage::group_storage(db);
let query_storage = Query::query_storage(group_storage);
Similarly, we can convert from the key to an individual query up to the "database key" in a two-step process:
let group_key = Query::group_key(key);
let db_key = HasGroupStorage::database_key(group_key);
Lowering query groups
The role of the #[salsa::query_group(MyGroupStorage)] trait MyGroup { .. }
macro is primarily to generate the group storage struct and the
impl of QueryGroup
. That involves generating the following things:
- the query trait
MyGroup
itself, but with:salsa::foo
attributes stripped#[salsa::input]
methods expanded to include setters:fn set_my_input(&mut self, key: K1, value__: V1);
fn set_constant_my_input(&mut self, key: K1, value__: V1);
- the query group storage struct
MyGroupStorage
- We also generate an impl of
QueryGroup<DB>
forMyGroupStorage
, linking to the internal strorage struct and group key enum
- We also generate an impl of
- the individual query types
- Ideally, we would use Rust hygiene to hide these struct, but as
that is not currently possible they are given names based on the
queries, but converted to camel-case (e.g.,
MyInputQuery
andMyQueryQuery
). - They implement the
salsa::Query
trait.
- Ideally, we would use Rust hygiene to hide these struct, but as
that is not currently possible they are given names based on the
queries, but converted to camel-case (e.g.,
- the internal group storage struct
- Ideally, we would use Rust hygiene to hide this struct, but as
that is not currently possible it is entitled
MyGroupGroupStorage<DB>
. Note that it is generic with respect to the databaseDB
. This is because the actual query storage requires sometimes storing database key's and hence we need to know the final database type. - It contains one field per query with a link to the storage information
for that query:
my_query: <MyQueryQuery as salsa::plumbing::Query<DB>>::Storage
- (the
MyQueryQuery
type is also generated, see the "individual query types" below)
- The internal group storage struct offers a public, inherent method
for_each_query
:fn for_each_query(db: &DB, op: &mut dyn FnMut(...)
- this is invoked by the code geneated by
#[salsa::database]
when implementing thefor_each_query
method of theplumbing::DatabaseOps
trait
- Ideally, we would use Rust hygiene to hide this struct, but as
that is not currently possible it is entitled
- the group key
- Again, ideally we would use hygiene to hide the name of this struct,
but since we cannot, it is entitled
MyGroupGroupKey
- It is an enum which contains one variant per query with the value being the key:
my_query(<MyQueryQuery as salsa::plumbing::Query<DB>>::Key)
- The group key enum offers a public, inherent method
maybe_changed_after
:fn maybe_changed_after<DB>(db: &DB, db_descriptor: &DB::DatabaseKey, revision: Revision)
- it is invoked when implementing
maybe_changed_after
for the database key
- Again, ideally we would use hygiene to hide the name of this struct,
but since we cannot, it is entitled
Lowering database storage
The #[salsa::database(MyGroup)]
attribute macro creates the links to the query groups.
It generates the following things:
- impl of
HasQueryGroup<MyGroup>
forMyDatabase
- Naturally, there is one such impl for each query group.
- the database key enum
- Ideally, we would use Rust hygiene to hide this enum, but currently
it is called
__SalsaDatabaseKey
. - The database key is an enum with one variant per query group:
MyGroupStorage(<MyGroupStorage as QueryGroup<MyDatabase>>::GroupKey)
- Ideally, we would use Rust hygiene to hide this enum, but currently
it is called
- the database storage struct
- Ideally, we would use Rust hygiene to hide this enum, but currently
it is called
__SalsaDatabaseStorage
. - The database storage struct contains one field per query group, storing
its internal storage:
my_group_storage: <MyGroupStorage as QueryGroup<MyDatabase>>::GroupStorage
- Ideally, we would use Rust hygiene to hide this enum, but currently
it is called
- impl of
plumbing::DatabaseStorageTypes
forMyDatabase
- This is a plumbing trait that links to the database storage / database key types.
- The
salsa::Runtime
uses it to determine what data to include. The query types use it to determine a database-key.
- impl of
plumbing::DatabaseOps
forMyDatabase
- This contains a
for_each_query
method, which is implemented by invoking, in turn, the inherent methods defined on each query group storage struct.
- This contains a
- impl of
plumbing::DatabaseKey
for the database key enum- This contains a method
maybe_changed_after
. We implement this by matching to get a particular group key, and then invoking the inherent method on the group key struct.
- This contains a method
Alternatives
This proposal results from a fair amount of iteration. Compared to the status quo, there is one primary downside. We also explain a few things here that may not be obvious.
Why include a group storage struct?
You might wonder why we need the MyGroupStorage
struct at all. It is a touch of boilerplate,
but there are several advantages to it:
- You can't attach associated types to the trait itself. This is because the "type version"
of the trait (
dyn MyGroup
) may not be available, since not all traits are dyn-capable. - We try to keep to the principle that "any type that might be named
externally from the macro is given its name by the user". In this
case, the
[salsa::database]
attribute needed to name group storage structs.- In earlier versions, we tried to auto-generate these names, but
this failed because sometimes users would want to
pub use
the query traits and hide their original paths. - (One exception to this principle today are the per-query structs.)
- In earlier versions, we tried to auto-generate these names, but
this failed because sometimes users would want to
- We expect that we can use the
MyGroupStorage
to achieve more encapsulation in the future. While the struct must be public and named from the database, the trait (and query key/value types) actually does not have to be.
Downside: Size of a database key
Database keys now wind up with two discriminants: one to identify the
group, and one to identify the query. That's a bit sad. This could be
overcome by using unsafe code: the idea would be that a group/database
key would be stored as the pair of an integer and a union
. Each
group within a given database would be assigned a range of integer
values, and the unions would store the actual key values. We leave
such a change for future work.
Future possibilities
Here are some ideas we might want to do later.
No generics
We leave generic parameters on the query group trait etc for future work.
Public / private
We'd like the ability to make more details from the query groups private. This will require some tinkering.
Inline query definitions
Instead of defining queries in separate functions, it might be nice to have the option of defining query methods in the trait itself:
#[salsa::query_group(MyGroupStorage)]
trait MyGroup {
#[salsa::input]
fn my_input(&self, key: K1) -> V1;
fn my_query(&self, key: K2) -> V2 {
// define my-query right here!
}
}
It's a bit tricky to figure out how to handle this, so that is left for future work. Also, it would mean that the method body itself is inside of a macro (the procedural macro) which can make IDE integration harder.
Non-query functions
It might be nice to be able to include functions in the trait that are
not queries, but rather helpers that compose queries. This should be
pretty easy, just need a suitable #[salsa]
attribute.