GRANDstack Access Control - Query Transformations

imkleats

Ian Kleats

Posted on February 23, 2020

Welcome back to this exploratory series on discretionary access control with the GRANDstack! First off, I need to fess up about something.

I lied to you in the last article. I told you we were going to jump right into crafting a schema directive. We are not. That's because I don't want to end up lying to you a second time.

I told you this series would assume "some basic familiarity with GraphQL concepts." We are actually going to be digging into certain parts of the GraphQL reference implementation that you might never see even if you were highly proficient in developing GraphQL backends.

Hold up. Can't you just use some Apollo tooling to do a query document transformation and skip this? Probably for this use-case, but I'm not going to take that route.

It's selfish, really. I have a pattern for document transformations that I want to riff on because I believe it will elegantly solve some problems when we move on to mutations. I don't want to throw this pattern at you without giving you some background knowledge, though.

Where do we start?

Let's start at the beginning. Take a look at the GraphQL JavaScript reference implementation's Getting Started section. Notice how the "Hello World" response is generated:

// Run the GraphQL query '{ hello }' and print out the response
graphql(schema, '{ hello }', root).then((response) => {
  console.log(response);
});

Ok, so we can see that there are arguments for 'schema' and 'root'. With GRANDstack, both of these are taken care of by makeAugmentedSchema from neo4j-graphql-js, so let's ignore 'em for now and maybe later too.

The middle argument is a query string. Our end goal is to stifle the machinations of your nosy neighbor nemesis, Bob. We talked about how he could circumvent the filter arguments by submitting his own queries that didn't include them. Let's see where that rabbit hole leads.

If we click on the API reference link for the graphql function, we'd find this description:

graphql
graphql(
  schema: GraphQLSchema,
  requestString: string,
  rootValue?: ?any,
  contextValue?: ?any,
  variableValues?: ?{[key: string]: any},
  operationName?: ?string
): Promise<GraphQLResult>

The graphql function lexes, parses, validates and executes a GraphQL request. It requires a schema and a requestString. Optional arguments include a rootValue, which will get passed as the root value to the executor, a contextValue, which will get passed to all resolve functions, variableValues, which will get passed to the executor to provide values for any variables in requestString, and operationName, which allows the caller to specify which operation in requestString will be run, in cases where requestString contains multiple top-level operations.

And you may ask yourself, "How do I work this?"

We've pulled back a layer of the GraphQL onion and found that there are four primary concerns for the main entrypoint to the reference implementation: lexing, parsing, validating, and executing. BUT WHAT DOES IT MEAN? Let's dig into each of those at a high level.

  • Lexing turns the query string into tokens that are used by the parser.
  • Parsing turns the tokens from the lexer into a Document AST.
  • Validating traverses the Document AST to ensure proper AST structure and enforce the type system.
  • Executing executes the validated Document AST.
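To make the first of those steps a little less abstract, here is a toy tokenizer for a small subset of GraphQL syntax. It is a sketch for illustration only, not the graphql-js lexer: the `tokenize` function and its token shape are made up for this example, and it ignores strings, numbers, comments, and spreads.

```javascript
// Toy tokenizer for a small subset of GraphQL syntax.
// ASSUMPTION: `tokenize` and its token shape are illustrative only; this is
// not how graphql-js implements its lexer.
function tokenize(source) {
  // Sticky regex: whitespace or comma (ignored), a Name, or one punctuator.
  const pattern = /\s+|,|([_A-Za-z][_0-9A-Za-z]*)|([{}()\[\]:!$=@|&])/y;
  const tokens = [];
  let pos = 0;
  while (pos < source.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(source);
    if (match === null) {
      throw new SyntaxError(`Unexpected character "${source[pos]}" at position ${pos}`);
    }
    if (match[1] !== undefined) {
      tokens.push({ kind: 'Name', value: match[1] });
    } else if (match[2] !== undefined) {
      tokens.push({ kind: match[2] });
    }
    // Whitespace and commas carry no meaning, so they produce no token.
    pos = pattern.lastIndex;
  }
  return tokens;
}

console.log(tokenize('{ hello }'));
// [ { kind: '{' }, { kind: 'Name', value: 'hello' }, { kind: '}' } ]
```

The parser's job is then to turn a flat list like this into the nested Document AST we're about to look at.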

So, if you had the "basic familiarity with GraphQL concepts" I was assuming last article, you have probably not spent much time in the graphql/language module that is pivotal to those first three concerns. Let's change that.

Fun with Parsing

Have you heard about AST explorer (site and github)? It's a'ight, you know, if you like being able to see how your GraphQL queries get parsed into Document ASTs. We can go ahead and copy over the query we came up with last time.

query aclTasks($user_id: ID!){
  Task(filter: {visibleTo_some: {userId: $user_id}}) {
    taskId
    name
    details
  }
}

Cool! Take a few minutes, hours, days, or weeks to wrap your head around what your queries become. Play around with it. Parsing works with more than query/mutation operations. Try throwing your type, directive, and schema definitions at it, too.

Depending on how deep down the rabbit hole you want to go, you can consult a mix of the GraphQL Specification and the actual definitions of AST nodes in the JavaScript reference implementation.

Back to business

Alright, what did we notice? Here are a few of my takeaways:

  • The root node of whatever you're parsing is the DocumentNode, and its only children are DefinitionNodes in an array labeled definitions.
  • Our queries, mutations, and subscriptions show up as OperationDefinition nodes.
  • Some of the arguments from graphql() make a little more sense. For instance, if you add multiple query or mutation blocks, you see more than one OperationDefinition node. Your executor needs you to tell it which one to run.
    • This could be pretty cool down the road. Imagine what we might do if we could define and use extraneous query blocks for some other purpose in the background or even as inputs into resolving the primary operation? IMAGINE! That might be a topic for another series.
  • The first selectionSet within the OperationDefinition will hold Fields that are representative of the fields defined in our schema's root Query, Mutation, and Subscription types.
  • Each Field has an optional attribute of arguments, which contains an array of ArgumentNodes. This is where our filter arguments show up.
  • The value of our filter argument is an ObjectValueNode, whose fields array holds ObjectFieldNodes, a kind of key-value data structure. The key of each ObjectFieldNode is a NameNode, and its value is a ValueNode (possibly another ObjectValueNode). Complex filter arguments might be nested several levels deep.
  • Our OperationDefinition nodes don't give us any schema-related type info for the Fields they contain. If we want to define a schema directive on our type definitions to trigger this filter behavior, we are going to have to find a way to somehow access that type info.
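If you'd rather read those takeaways as data, here is an abbreviated, hand-written version of the Document AST for the aclTasks query. The node kinds and property names follow the graphql-js AST, but loc info, variableDefinitions, and directives are left out, and the helpers (`nameNode`, `findOperation`, `findArgument`) are hypothetical names made up for this sketch.

```javascript
// Abbreviated, hand-written AST for the aclTasks query. What parse() returns
// carries more detail (loc, variableDefinitions, directives).
const nameNode = (value) => ({ kind: 'Name', value });

const aclTasksDocument = {
  kind: 'Document',
  definitions: [{
    kind: 'OperationDefinition',
    operation: 'query',
    name: nameNode('aclTasks'),
    selectionSet: {
      kind: 'SelectionSet',
      selections: [{
        kind: 'Field',
        name: nameNode('Task'),
        arguments: [{
          kind: 'Argument',
          name: nameNode('filter'),
          value: {
            kind: 'ObjectValue',
            fields: [{
              kind: 'ObjectField',
              name: nameNode('visibleTo_some'),
              value: {
                kind: 'ObjectValue',
                fields: [{
                  kind: 'ObjectField',
                  name: nameNode('userId'),
                  value: { kind: 'Variable', name: nameNode('user_id') },
                }],
              },
            }],
          },
        }],
        selectionSet: {
          kind: 'SelectionSet',
          selections: ['taskId', 'name', 'details'].map((field) => ({
            kind: 'Field',
            name: nameNode(field),
            arguments: [],
          })),
        },
      }],
    },
  }],
};

// Pick one operation by name, the way operationName disambiguates a document
// that holds more than one OperationDefinition.
function findOperation(document, operationName) {
  return document.definitions.find(
    (def) => def.kind === 'OperationDefinition' && def.name && def.name.value === operationName
  );
}

// Look up an argument on a Field node by its name.
function findArgument(fieldNode, argumentName) {
  return (fieldNode.arguments || []).find((arg) => arg.name.value === argumentName);
}

const aclOperation = findOperation(aclTasksDocument, 'aclTasks');
const taskField = aclOperation.selectionSet.selections[0];
const filterArg = findArgument(taskField, 'filter');

console.log(filterArg.value.kind); // ObjectValue
console.log(filterArg.value.fields[0].name.value); // visibleTo_some
```

Notice that nothing in this structure says Task returns a Task type. That gap is exactly the missing type info from the last bullet.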

Thinking About a Potential Implementation

We're getting very close to fully conceptualizing the steps that will need to occur in the implementation of our discretionary access control directive. Let's lay them out.

  1. By looking at the internals of neo4jgraphql, we can see it uses the resolveInfo argument. That thing seems to have the pieces we need to get this done.
    • We could use the resolveInfo from the resolver functions, or we could preemptively create the parts we need by applying middleware that somehow feeds into the resolver context.
  2. GraphQL queries can be written in all sorts of shapes, sizes, and permutations. That's kinda the point. We're going to need some sort of recursion to hit all relevant parts of the OperationDefinition.
    • Bad Joke Break: What did the recursive process say to the AST? I'll get to the bottom of this!
  3. As we're traversing, we could create a parallel OperationDefinition AST with modified filter arguments. We can use the schema field of resolveInfo to identify which types carry the schema directive that triggers this behavior.
  4. Replace the old operation value of resolveInfo with the transformed OperationDefinition node when passing it to neo4jgraphql in your root resolvers, and let neo4jgraphql do its thing without interference.
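Steps 2 through 4 can be sketched by hand on plain objects. Treat this as a rough illustration under loud assumptions, not the implementation this series ends up with: `protectedFields` is a hard-coded stand-in for looking up the directive through resolveInfo.schema (and matching on field name rather than return type is a shortcut), the user ID is passed in as a string as if it came from an authenticated context, and a field that already has a filter is left alone even though a real version would need to merge the two.

```javascript
const nameOf = (value) => ({ kind: 'Name', value });

// ASSUMPTION: hard-coded stand-in for "fields whose type carries the
// directive"; the real lookup would go through resolveInfo.schema.
const protectedFields = new Set(['Task']);

// AST for: filter: { visibleTo_some: { userId: "<userId>" } }
function aclFilterArgument(userId) {
  const objectField = (key, value) => ({ kind: 'ObjectField', name: nameOf(key), value });
  const objectValue = (...fields) => ({ kind: 'ObjectValue', fields });
  return {
    kind: 'Argument',
    name: nameOf('filter'),
    value: objectValue(
      objectField('visibleTo_some', objectValue(
        objectField('userId', { kind: 'StringValue', value: userId })
      ))
    ),
  };
}

// Steps 2 and 3: recurse through every node and return a parallel copy in
// which protected Fields that lack a filter argument get one added.
function addAclFilters(node, userId) {
  if (Array.isArray(node)) return node.map((child) => addAclFilters(child, userId));
  if (node === null || typeof node !== 'object') return node;

  const copy = {};
  for (const [key, value] of Object.entries(node)) {
    copy[key] = addAclFilters(value, userId);
  }
  if (copy.kind === 'Field' && protectedFields.has(copy.name.value)) {
    const args = copy.arguments || [];
    if (!args.some((arg) => arg.name.value === 'filter')) {
      copy.arguments = [...args, aclFilterArgument(userId)];
    }
  }
  return copy;
}

// Bob's filter-free query, `{ Task { taskId } }`, as an abbreviated AST.
const bobsOperation = {
  kind: 'OperationDefinition',
  operation: 'query',
  selectionSet: {
    kind: 'SelectionSet',
    selections: [{
      kind: 'Field',
      name: nameOf('Task'),
      arguments: [],
      selectionSet: {
        kind: 'SelectionSet',
        selections: [{ kind: 'Field', name: nameOf('taskId'), arguments: [] }],
      },
    }],
  },
};

// Step 4: swap the transformed operation into a copy of resolveInfo.
const resolveInfo = { operation: bobsOperation }; // stand-in with only the field we need
const patchedInfo = { ...resolveInfo, operation: addAclFilters(resolveInfo.operation, 'bob') };

const patchedTask = patchedInfo.operation.selectionSet.selections[0];
console.log(patchedTask.arguments.map((arg) => arg.name.value)); // [ 'filter' ]
console.log(bobsOperation.selectionSet.selections[0].arguments.length); // 0, original untouched
```

Building a copy instead of mutating in place keeps the original operation intact for anything else that holds a reference to it.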

Saving yourself some work

Hey! You know who's lazy? Me.

It turns out that #2 and #3 are problems that have already been solved. Remember how I said:

Validating traverses the Document AST to ensure proper AST structure and enforce the type system.

Sounds kinda, sorta, a little bit like what we're wanting to do, no? Let's put it side-by-side.

  • Validation traverses the AST, examines the contents of each node relative to the type system, identifies features that need to exist or not exist in each node, and collects a record of that identification in the form of error values.
  • Transformation traverses the AST, examines the contents of each node relative to the type system, identifies features that need to exist or not exist in each node, and collects a record of that identification in the form of modified nodes.
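Here is that comparison as a rough sketch. The `walk` function is a minimal stand-in for a depth-first traversal (it is not graphql-js's visit()), the hard-coded 'Task' check stands in for real type info, and both function names are hypothetical. The point is only that the two flavors share the traversal and the check, and differ in what they collect.

```javascript
// Minimal depth-first walk over plain AST-shaped objects.
// ASSUMPTION: a stand-in for a real visitor, kept tiny on purpose.
function walk(node, onNode) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, onNode));
  } else if (node !== null && typeof node === 'object') {
    if (typeof node.kind === 'string') onNode(node);
    Object.values(node).forEach((child) => walk(child, onNode));
  }
}

const hasFilter = (field) =>
  (field.arguments || []).some((arg) => arg.name.value === 'filter');

const needsFilter = (node) =>
  node.kind === 'Field' && node.name.value === 'Task' && !hasFilter(node);

// Validation flavor: collect error values.
function validateFilters(ast) {
  const errors = [];
  walk(ast, (node) => {
    if (needsFilter(node)) {
      errors.push(new Error('Field "Task" is missing a filter argument.'));
    }
  });
  return errors;
}

// Transformation flavor: collect modified nodes, keyed by the original node.
function collectFilterPatches(ast, filterArgument) {
  const patches = new Map();
  walk(ast, (node) => {
    if (needsFilter(node)) {
      patches.set(node, { ...node, arguments: [...(node.arguments || []), filterArgument] });
    }
  });
  return patches;
}

const sampleAst = { kind: 'Field', name: { kind: 'Name', value: 'Task' }, arguments: [] };
const placeholderFilter = { kind: 'Argument', name: { kind: 'Name', value: 'filter' } };

console.log(validateFilters(sampleAst).length); // 1
console.log(collectFilterPatches(sampleAst, placeholderFilter).size); // 1
```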

Yep. Checks out to me. Let's take a look, and...

That might just work!

Now we circle back to the comments I made up top about being a little selfish by not just using some existing Apollo tooling. I've taken the liberty of porting over the validation implementation to a transformation context.

imkleats / graphql-ast-tools (GitHub)

Rule-based translation of GraphQL Document ASTs to ASTs of other query languages

This is the pattern I'm going to use to implement our filter argument transformations next time. At a very high level:

  • It uses visit() for depth-first traversal, visitWithTypeInfo() for access to the type info from our schema, and visitInParallel() to run multiple visitor functions.
  • These visitor functions allow for separation of concerns within and across certain kinds of AST nodes.
  • Instead of collecting an array of error values, we can collect pieces of a transformed AST in a map that allows for lazy evaluation once traversal is complete.

The road goes ever on and on.

Thanks for joining me on this foray into some GraphQL concepts and implementation details that you might never have wanted to see! We've gone end-to-end to identify some key considerations in query transformation, and I've introduced the structure of a solution I will continue fleshing out.

Now, when we start building the transformation rules and visitor functions we need, I hope you're able to understand what we're doing and why we're doing it. Till next time!
