GraphQL is not Terraform
Stefan 🚀
Posted on October 18, 2022
More and more tools and solutions use GraphQL as a configuration language.
It's been almost three years since I experimented with this approach myself, but I quickly abandoned the idea.
This post goes through a number of examples and explains why I think GraphQL is not great for configuration.
At WunderGraph, we use TypeScript as our primary configuration language.
I think it's a great fit to make configuring applications intuitive, self-explanatory and type-safe.
Other alternatives are YAML, TOML, HCL and JSON,
but let's start with GraphQL.
Falling in love with GraphQL way too much
I really enjoy working with GraphQL.
It's probably the best language to "Query" pre-defined object trees.
It's not really a Graph Database Query Language, like Cypher, but we still love it.
I'd call GraphQL more a JSON Query Language, actually.
The specification doesn't really mention JSON,
but most of us use GraphQL to query JSON "trees".
When you're working with APIs that produce JSON,
GraphQL is really great to get subsets of the data you need.
Other API styles, like gRPC or REST, lack "Selection Sets",
which makes it less intuitive to get a subset of the whole data tree.
And even if your APIs don't produce JSON,
or have more cumbersome protocols, like Kafka or gRPC,
you can still use GraphQL on top of them with a tool like WunderGraph.
Anyway, GraphQL is great for querying JSON APIs.
But what about configuration?
It's been almost four years since I started "implementing" GraphQL in Go.
When experimenting with a new tool,
it's obvious that you want to use it everywhere,
and so I did.
One year after I started implementing the language itself,
I built a GraphQL Gateway that uses GraphQL SDL (Schema Definition Language) as the primary configuration language.
As you can see, the project is archived and not maintained anymore.
It was a great experiment,
but the language is just too limited to be used as a configuration language.
Next, I'll go through some examples to illustrate the problems.
GraphQL is lacking imports
If you look at my example configuration,
you'll notice that the first 102 lines are just declarations of directives and types.
You have to put them somewhere,
because otherwise your IDE won't be able to give you intellisense.
GraphQL, by default, has no support for imports.
This leads to hacks like this:
# import '../fragments/UserData.graphql'
query GetUser {
  user(id: 1) {
    # ...UserData
    email
  }
}
If you search the GraphQL specification for the keyword "import",
you'll find that it's not mentioned anywhere.
Have a look at this example from Apollo Federation 2.
Where is the actual schema?
At first glance, what does this GraphQL API look like?
It's not obvious at all.
Another issue you'll run into is that you have to copy-paste the "declarations".
How do you verify if the declarations actually match your runtime?
Once the runtime changes, do you expect the developer to update the declarations manually?
Abuse of GraphQL Directives
In my opinion, directives are great when used with GraphQL Operations,
but quickly turn a GraphQL Schema into a mess when used for configuration.
Here's a good example:
mutation ($email: String!) @rbac(requireMatchAll: [superadmin]) {
  deleteManyMessages(where: { users: { is: { email: { equals: $email } } } }) {
    count
  }
}
To be allowed to execute this mutation,
you have to be a superadmin, that's it.
We can immediately see that this is a mutation, and it's about deleting messages.
The directive doesn't get in the way of understanding the mutation.
Next, let's look at an awful example of using directives. It's from my GraphQL Gateway experiments:
type Query {
  post(id: Int!): JSONPlaceholderPost
    @HttpJsonDataSource(
      host: "jsonplaceholder.typicode.com"
      url: "/posts/{{ .arguments.id }}"
    )
    @mapping(mode: NONE)
  country(code: String!): Country
    @GraphQLDataSource(
      host: "countries.trevorblades.com"
      url: "/"
      field: "country"
      params: [
        {
          name: "code"
          sourceKind: FIELD_ARGUMENTS
          sourceName: "code"
          variableType: "String!"
        }
      ]
    )
  person(id: String!): Person
    @WasmDataSource(
      wasmFile: "./person.wasm"
      input: "{\"id\":\"{{ .arguments.id }}\"}"
    )
    @mapping(mode: NONE)
  httpBinPipeline: String
    @PipelineDataSource(
      configFilePath: "./httpbin_pipeline.json"
      inputJSON: """
      {
        "url": "https://httpbin.org/get",
        "method": "GET"
      }
      """
    )
    @mapping(mode: NONE)
}
How many root fields does this schema have? 4. But it's not obvious.
Four root fields should be 4 lines of code, not 37.
Keep in mind, this is a simple example.
We didn't even start adding directives for authentication, authorization, rate limiting, etc...
Currently, the ratio between root fields and configuration directives is almost 1:10.
Adding more logic to the configuration will only make it worse.
Here's another example from Apollo Federation 2:
type Product implements ProductItf & SkuItf
  @join__implements(graph: INVENTORY, interface: "ProductItf")
  @join__implements(graph: PRODUCTS, interface: "ProductItf")
  @join__implements(graph: PRODUCTS, interface: "SkuItf")
  @join__implements(graph: REVIEWS, interface: "ProductItf")
  @join__type(graph: INVENTORY, key: "id")
  @join__type(graph: PRODUCTS, key: "id")
  @join__type(graph: PRODUCTS, key: "sku package")
  @join__type(graph: PRODUCTS, key: "sku variation { id }")
  @join__type(graph: REVIEWS, key: "id")
{
  id: ID! @tag(name: "hi-from-products")
  dimensions: ProductDimension @join__field(graph: INVENTORY, external: true) @join__field(graph: PRODUCTS)
  delivery(zip: String): DeliveryEstimates @join__field(graph: INVENTORY, requires: "dimensions { size weight }")
  sku: String @join__field(graph: PRODUCTS)
  name: String @join__field(graph: PRODUCTS)
  package: String @join__field(graph: PRODUCTS)
  variation: ProductVariation @join__field(graph: PRODUCTS)
  createdBy: User @join__field(graph: PRODUCTS)
  hidden: String @join__field(graph: PRODUCTS)
  reviewsScore: Float! @join__field(graph: REVIEWS, override: "products")
  oldField: String @join__field(graph: PRODUCTS)
  reviewsCount: Int! @join__field(graph: REVIEWS)
  reviews: [Review!]! @join__field(graph: REVIEWS)
}
I wonder how you debug this schema when one of the @join__field or @join__type directives is wrong.
But even if we ignore the debugging part,
the heavy use of directives makes it impossible to understand the schema at a glance.
Here's an example from Federation 1:
type Product @key(fields: "id") {
  id: ID!
  name: String
  price: Int
  weight: Int
  inStock: Boolean
  shippingEstimate: Int @external
  warehouse: Warehouse @requires(fields: "inStock")
}
The @key, @external, and @requires directives might be a bit weird,
but at least the schema was readable.
GraphQL Directives force you to repeat yourself
Here's another example where I wanted to apply the rule of "mapping mode NONE" to all root fields:
type Query {
  hello: String!
    @StaticDataSource(
      data: "World!"
    )
    @mapping(mode: NONE)
  staticBoolean: Boolean!
    @StaticDataSource(
      data: "true"
    )
    @mapping(mode: NONE)
  nonNullInt: Int!
    @StaticDataSource(
      data: "1"
    )
    @mapping(mode: NONE)
  nullableInt: Int
    @StaticDataSource(
      data: null
    )
    @mapping(mode: NONE)
  foo: Foo!
    @StaticDataSource(
      data: "{\"bar\": \"baz\"}"
    )
    @mapping(mode: NONE)
}
Wouldn't it be nice if I could apply default configurations to all root fields?
With a language like TypeScript, we could have a map function that applies the default configurations to all root fields.
In GraphQL, there's no such simple solution to tackle repetition.
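Here's a minimal sketch of what that could look like. The `FieldConfig` shape and the `mapFields` helper are hypothetical names made up for this illustration; they are not part of WunderGraph or any other library.

```typescript
// Hypothetical configuration shapes, for illustration only.
type MappingMode = "NONE" | "PATH";

interface FieldConfig {
  staticData: string | null;
  mapping?: { mode: MappingMode };
}

type RootFields = Record<string, FieldConfig>;

// Apply a function to the configuration of every root field.
function mapFields(
  fields: RootFields,
  fn: (config: FieldConfig) => FieldConfig,
): RootFields {
  const result: RootFields = {};
  for (const name of Object.keys(fields)) {
    result[name] = fn(fields[name]);
  }
  return result;
}

const defaultMapping: FieldConfig["mapping"] = { mode: "NONE" };

// The default is written once; a field that sets its own mapping would override it.
const query = mapFields(
  {
    hello: { staticData: "World!" },
    staticBoolean: { staticData: "true" },
    nonNullInt: { staticData: "1" },
    nullableInt: { staticData: null },
    foo: { staticData: '{"bar": "baz"}' },
  },
  (config) => ({ mapping: defaultMapping, ...config }),
);
```

Each root field is back to a single line, and the "mapping mode NONE" rule lives in exactly one place.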
GraphQL Directives are not composable
As you can see in the examples above,
it would be quite handy if we could "compose" the @HttpJsonDataSource directive with the @mapping directive,
because in most cases, we need them together.
In a language like TypeScript, we could wrap the HttpJsonDataSource function with the mapping function:
const HttpJsonDataSourceWithMapping = (config: HttpJsonDataSourceConfig) => {
  return mapping(HttpJsonDataSource(config));
};
With GraphQL, we always have to apply all directives separately.
It's repetitive and makes the schema harder to read.
GraphQL Directives are not type-safe
One of the things a lot of people praise about GraphQL is that it's type-safe.
When using GraphQL Directives for configuration,
you lose this type-safety as soon as it gets more complex.
Let's take a look at some examples.
I don't just want to rant about others, so I'll start with my own example:
type Query {
  post(id: Int): JSONPlaceholderPost
    @HttpJsonDataSource(
      host: "jsonplaceholder.typicode.com"
      url: "/posts/{{ .arguments.id }}"
    )
}
In the URL, we use some weird templating syntax to inject the id argument into the URL.
How do we know how to correctly write this template when it's just a string?
What if the id is null?
Let's add some more logic to enhance readability (sarcasm):
type Query {
  post(id: Int): JSONPlaceholderPost
    @HttpJsonDataSource(
      host: "jsonplaceholder.typicode.com"
      url: "/posts/{{ default 0 .arguments.id }}"
    )
}
This example was weird, but still somewhat ok.
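For comparison, here's a rough sketch of the same data source expressed in TypeScript. The `PostArgs` and `HttpJsonDataSourceConfig` types are assumptions invented for this example, not the API of my old gateway or of WunderGraph. The point is only that the URL becomes a function of typed arguments instead of a template hidden in a string.

```typescript
// Hypothetical shapes, for illustration only.
interface PostArgs {
  id: number | null;
}

interface HttpJsonDataSourceConfig<Args> {
  host: string;
  // The URL is computed by a function, so the compiler checks every argument access.
  url: (args: Args) => string;
}

const postDataSource: HttpJsonDataSourceConfig<PostArgs> = {
  host: "jsonplaceholder.typicode.com",
  // `args.id` may be null, so the fallback has to be spelled out explicitly.
  url: (args) => `/posts/${args.id ?? 0}`,
};
```

A typo like `args.idd` would be rejected by the compiler, and the null case is visible in the type instead of being handled by a `default` helper inside a string.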
If you have to configure a GraphQL DataSource, it gets more complex:
type Query {
  country(code: String!): Country
    @GraphQLDataSource(
      host: "countries.trevorblades.com"
      url: "/"
      field: "country"
      params: [
        {
          name: "code"
          sourceKind: FIELD_ARGUMENTS
          sourceName: "code"
          variableType: "String!"
        }
      ]
    )
}
In this example, we have to "apply" the code argument to the GraphQL datasource.
We have to reference it by name (name: "code"), but this is also just a string without any type-safety.
Because we need to be able to apply some "mapping" logic, we also have to specify the sourceName and variableType,
which are also just strings.
All of this works, but it's very fragile and hard to debug.
But looking at Federation, it can actually get worse:
type Product implements ProductItf & SkuItf
  @join__implements(graph: INVENTORY, interface: "ProductItf")
  @join__implements(graph: PRODUCTS, interface: "ProductItf")
  @join__implements(graph: PRODUCTS, interface: "SkuItf")
  @join__implements(graph: REVIEWS, interface: "ProductItf")
  @join__type(graph: INVENTORY, key: "id")
  @join__type(graph: PRODUCTS, key: "id")
  @join__type(graph: PRODUCTS, key: "sku package")
  @join__type(graph: PRODUCTS, key: "sku variation { id }")
  @join__type(graph: REVIEWS, key: "id")
{
  id: ID! @tag(name: "hi-from-products")
  dimensions: ProductDimension @join__field(graph: INVENTORY, external: true) @join__field(graph: PRODUCTS)
  delivery(zip: String): DeliveryEstimates @join__field(graph: INVENTORY, requires: "dimensions { size weight }")
  sku: String @join__field(graph: PRODUCTS)
  name: String @join__field(graph: PRODUCTS)
  package: String @join__field(graph: PRODUCTS)
  variation: ProductVariation @join__field(graph: PRODUCTS)
  createdBy: User @join__field(graph: PRODUCTS)
  hidden: String @join__field(graph: PRODUCTS)
  reviewsScore: Float! @join__field(graph: REVIEWS, override: "products")
  oldField: String @join__field(graph: PRODUCTS)
  reviewsCount: Int! @join__field(graph: REVIEWS)
  reviews: [Review!]! @join__field(graph: REVIEWS)
}
Take a look at lines 8, 9, and 14 of this schema.
We have to specify key and requires arguments to define joins between subgraphs.
If you look closely, you'll see that the value in the string argument is a SelectionSet.
From a technical point of view, it's an amazing solution to be able to select subgraph fields for a join using SelectionSets.
From a developer point of view, I don't think it's a great developer experience.
Starting with a solution (GraphQL) and then going back to the problem (configuration) limits your options.
A better workflow would be to start with the problem and then evaluate different solutions.
I think it's obvious that it can be done better,
but we have to take into account that there's a threshold for how far you can make people migrate to a new solution.
Federation 2 had to be similar enough to Federation 1 to make it easy to migrate.
Let's continue with some more examples:
extend type Product @key(fields: "id") {
  id: ID! @external
  inStock: Boolean!
}
This is a simple one. We're defining a key for the Product type.
Again, the value is a SelectionSet, although it's just a single field,
but there's still no autocompletion or type-safety.
Ideally, the IDE could tell us what the allowed inputs for the fields argument are,
which leads to the root cause of the problem.
GraphQL doesn't support Generics
If we were using a language with proper support for generics,
like TypeScript (surprise),
the @key directive could inherit meta information from the Product type it's attached to.
This way, we could make the fields argument above type-safe.
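As a rough sketch of that idea: the `Product` interface, the `EntityConfig` type and the `entity` helper below are all hypothetical and only exist to show how `keyof` ties the key fields to the type they belong to.

```typescript
// Hypothetical model and helper, for illustration only.
interface Product {
  id: string;
  sku: string;
  name: string;
}

interface EntityConfig<T> {
  typeName: string;
  // Only property names of T are accepted as key fields.
  keyFields: Array<keyof T>;
}

function entity<T>(typeName: string, keyFields: Array<keyof T>): EntityConfig<T> {
  return { typeName, keyFields };
}

const productEntity = entity<Product>("Product", ["id"]);

// The following line would not compile, because "idd" is not a property of Product:
// entity<Product>("Product", ["idd"]);
```

With this in place, the IDE can autocomplete `"id"`, `"sku"` and `"name"`, and a typo is caught before anything is deployed.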
Here's another example showing the lack of generics:
type Review @model {
  id: ID!
  rating: Int! @default(value: 5)
  published: Boolean @default(value: false)
  status: Status @default(value: PENDING_REVIEW)
}
enum Status {
  PENDING_REVIEW
  APPROVED
}
We can see that the @default directive is used multiple times to set default values for different fields.
The problem is that this is actually invalid GraphQL.
The argument value cannot be of type Int, Boolean and Status at the same time.
Multiple problems lead to this issue.
First, directive locations are very limited.
You can define that a directive is allowed on the location FIELD_DEFINITION,
but you cannot specify that it should only be allowed on Int fields.
But even if we could do that, it would still be ambiguous because we'd have to define multiple @default directives for different types.
So, ideally, we could leverage some sort of polymorphism to define a single @default directive that works for all types.
Unfortunately, GraphQL doesn't support this use case and might never do so.
Here's an alternative way to handle this:
type Review @model {
  id: ID!
  rating: Int! @defaultInt(value: 5)
  published: Boolean @defaultBoolean(value: false)
  status: Status @defaultStatus(value: PENDING_REVIEW)
}
This might now be a valid GraphQL Schema,
but there's another problem stemming from this approach.
We cannot enforce that the user puts @defaultInt on an Int field.
There's no constraint available in the GraphQL Schema language to enforce this.
TypeScript can easily do this.
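A minimal sketch of how that could work, assuming a hypothetical `Field<T>` descriptor and `withDefault` helper (neither comes from a real library):

```typescript
// Hypothetical field builders, for illustration only.
interface Field<T> {
  name: string;
  defaultValue?: T;
}

type Status = "PENDING_REVIEW" | "APPROVED";

function field<T>(name: string): Field<T> {
  return { name };
}

// One generic helper instead of @defaultInt, @defaultBoolean, @defaultStatus:
// the default value must match the type of the field.
function withDefault<T>(f: Field<T>, value: T): Field<T> {
  return { ...f, defaultValue: value };
}

const rating = withDefault(field<number>("rating"), 5);
const published = withDefault(field<boolean>("published"), false);
const status = withDefault(field<Status>("status"), "PENDING_REVIEW");

// The following line would not compile, because a string is not a valid default for a number field:
// withDefault(field<number>("rating"), "five");
```

A single generic function covers all three cases, and the constraint between the field type and the default value is checked by the compiler.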
While some workarounds are acceptable,
please don't use GraphQL in ways that generate invalid Schemas.
It breaks all your tooling, IDEs, linters, etc.
Don't put ECMAScript in GraphQL multiline strings
Here's another abomination:
type Query {
  scriptExample(message: String!): JSON
    @rest(
      endpoint: "https://httpbin.org/anything"
      method: POST
      ecmascript: """
      function bodyPOST(s) {
        let body = JSON.parse(s);
        body.ExtraMessage = get("message");
        return JSON.stringify(body);
      }
      function transformREST(s) {
        let out = JSON.parse(s);
        out.CustomMessage = get("message");
        return JSON.stringify(out);
      }
      """
    )
}
I understand why this is done, but it's still a bad developer experience.
It's 2022, and you're putting ECMAScript in a string?
What problem are we solving here?
It's common to have some middleware logic that runs before or after a request,
either to transform the request or the response.
At WunderGraph, we've got a mutatingPreResolveHook as well as a mutatingPostResolveHook for exactly this use case.
The difference is that you can write the logic in TypeScript:
all hooks are type-safe, you get autocompletion,
you can import any npm package you want,
you can use async/await,
and you can even test and debug your hooks.
The solution above is a hack.
It's a hack that's necessary because you can't use GraphQL to write middleware.
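To make the contrast concrete, here's a rough sketch of the two script functions from above rewritten as plain TypeScript functions. The `preResolve` and `postResolve` names and their signatures are assumptions made for this illustration; they are not the actual signatures of WunderGraph's hooks.

```typescript
// Hypothetical types, for illustration only.
interface ScriptExampleInput {
  message: string;
}

type JsonObject = Record<string, unknown>;

// Runs before the request is sent: adds the message to the outgoing body.
function preResolve(body: JsonObject, input: ScriptExampleInput): JsonObject {
  return { ...body, ExtraMessage: input.message };
}

// Runs after the response arrives: adds the message to the result.
function postResolve(response: JsonObject, input: ScriptExampleInput): JsonObject {
  return { ...response, CustomMessage: input.message };
}
```

Because these are ordinary functions, there's no `JSON.parse` on a raw string, no untyped `get("message")` lookup, and both can be called directly from a unit test.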
GraphQL is not XSLT
Let's have a look at the next example:
type Query {
  anonymous: [Customer]
    @rest (endpoint: "https://api.com/customers",
      transforms: [
        {pathpattern: ["[]","name"], editor: "drop"}])
  known: [Customer]
    @rest (endpoint: "https://api.com/customers")
}
This reminds me a lot of XSLT.
If you're not familiar with XSLT, it's a language to transform XML documents.
The problem with this example is that we're trying to use GraphQL as a transformation language.
If you want to filter something, you can do so very easily with map or filter in TypeScript/JavaScript.
GraphQL was never designed to be used as a language to transform data.
I'm not sure how great of a developer experience this is.
I'd probably want type-safety and autocompletion for the transforms argument.
Then, how do I debug this "code"?
How do I write tests for this?
If we had generated TypeScript models for the types,
we could not only make the transformation function type-safe,
but also easily write and run unit tests for it.
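Here's a minimal sketch of the "drop name" transformation as a typed function. The `Customer` fields are assumed for the sake of the example, since the original schema doesn't show them.

```typescript
// Hypothetical generated model; the actual fields of Customer are not shown in the schema above.
interface Customer {
  id: string;
  name: string;
  email: string;
}

type AnonymousCustomer = Omit<Customer, "name">;

// Equivalent of the "drop name" transform, written as a plain function.
function anonymize(customers: Customer[]): AnonymousCustomer[] {
  return customers.map(({ name, ...rest }) => rest);
}
```

Because `anonymize` is a plain function, a unit test only needs an input array and an assertion on the output. No server has to be running.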
When it comes to creating great developer experiences,
what's super important is to give developers immediate feedback.
If my only option is to black-box-test my code by sending a request to the server,
I'm not going to be able to iterate as fast as I'd like to.
This whole approach feels like re-inventing an ESB with GraphQL.
Here's an XSLT example for reference:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h2>My CD Collection</h2>
        <table border="1">
          <tr bgcolor="#9acd32">
            <th>Title</th>
            <th>Artist</th>
          </tr>
          <xsl:for-each select="catalog/cd">
            <tr>
              <td><xsl:value-of select="title"/></td>
              <td><xsl:value-of select="artist"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
GraphQL is not an ORM
Another example shows how GraphQL can be abused as an ORM:
type Mutation {
  addCustomerById(id: ID!, name: String!, email: String!): Customer
    @dbquery(
      type: "mysql"
      table: "customer"
      dml: INSERT
      configuration: "mysql_config"
      schema: "schema_name"
    )
}
What are the problems here?
This not only takes up a lot of space,
it's also a lot less readable than a simple SQL query or a proper ORM.
Even if we ignore the fact that none of these arguments are type-safe,
there's another problem with this example: authentication and authorization.
We can't just expose this API to the public,
we have to protect it.
Here's the proposed solution:
access:
  policies:
    - type: Query
      rules:
        - condition: PREDICATE
          name: name of rule
          fields: [ fieldname ... ]
YAML!? Now you don't just have to deal with GraphQL,
but also with YAML.
Each of them is hard enough to deal with,
but together, it's definitely not going to be great.
To keep your API secure,
you have to keep your GraphQL Schema and this YAML file in sync.
GraphQL Directives are not easily discoverable and their usage is not obvious
Let me show you some obvious TypeScript code that transforms a string to lowercase:
const helloWorld = "Hello World";
const lower = helloWorld.toLowerCase(); // "hello world"
Now let's do the same with GraphQL:
type Query {
  helloWorld: String
}
Let's imagine our GraphQL server allows the use of a @lowerCase directive.
We can now use it like this:
type Query {
  helloWorld: String @lowerCase
}
What's the difference between the two examples?
The helloWorld constant in TypeScript is recognized as a string,
so we can call toLowerCase on it.
It's very obvious that we can call this method, because it's attached to the string type.
In GraphQL, there are no "methods" we can call on a field.
We can attach Directives to fields, but this is not obvious.
Additionally, the @lowerCase directive only makes sense on a string field,
but GraphQL doesn't allow us to limit the usage of a directive to a specific type.
What seems simple when there's just a single directive can become quite complex when you have 10, 20 or even more directives.
To conclude this section, the implementation of a directive usually carries a number of rules that cannot be expressed in the GraphQL schema.
E.g. the GraphQL specification doesn't allow us to limit the @lowerCase directive to string fields.
This means linters won't work, autocompletion won't work properly, and validation won't be able to detect these errors either.
Instead, the detection of the misuse of a directive will be deferred to runtime.
With TypeScript, we're catching these errors at compile time.
With tsc --watch --noEmit, we can even catch these errors while we're writing the code.
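As a small illustration of such a compile-time check, here's a sketch of a `lowerCase` helper that can only be applied to string fields. The `Field<T>` descriptor is a hypothetical type invented for this example.

```typescript
// Hypothetical field descriptor, for illustration only.
interface Field<T> {
  name: string;
  resolve: () => T;
}

// Only accepts string fields; any other field type is rejected by the compiler.
function lowerCase(field: Field<string>): Field<string> {
  return { ...field, resolve: () => field.resolve().toLowerCase() };
}

const helloWorldField: Field<string> = {
  name: "helloWorld",
  resolve: () => "Hello World",
};

const countField: Field<number> = {
  name: "count",
  resolve: () => 42,
};

const lowered = lowerCase(helloWorldField);

// The following line would not compile, because Field<number> is not assignable to Field<string>:
// lowerCase(countField);
```

The rule "only on string fields" is part of the function signature, so the editor, the linter and the compiler all know about it.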
So, what's the right way to use GraphQL?
Pulumi has proven that TypeScript is a great language for configuration,
and we've taken a lot of inspiration from them.
AWS CDK is following a similar approach where you can use TypeScript (alongside other languages) to define your infrastructure.
I think it's best if we use GraphQL for what it's good at,
as an API query language.
There are two main camps when it comes to using GraphQL: the schema-first approach and the code-first approach.
I think that schema-first is great when you'd like to define a pure GraphQL Schema,
whereas code-first can be a lot more powerful when you want to model complex use cases.
In case of the latter, the GraphQL Schema will be an artifact of what you've defined in code.
This allows you to use a "real" programming language, like TypeScript or C#, to define your API,
e.g. with a powerful DSL, while the resulting contract is still a GraphQL Schema.
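To illustrate the idea, here's a deliberately tiny sketch of a code-first definition that prints a GraphQL type. The `ObjectTypeDef` shape and the `printType` function are hypothetical and only support scalar fields; real code-first libraries are far more capable.

```typescript
// Hypothetical mini-DSL, for illustration only.
type ScalarName = "String" | "Int" | "Boolean" | "ID";

interface FieldDef {
  type: ScalarName;
  nullable?: boolean;
}

interface ObjectTypeDef {
  name: string;
  fields: Record<string, FieldDef>;
}

// Render the definition as GraphQL SDL; fields are non-null unless marked nullable.
function printType(def: ObjectTypeDef): string {
  const lines = Object.keys(def.fields).map((fieldName) => {
    const f = def.fields[fieldName];
    return `  ${fieldName}: ${f.type}${f.nullable ? "" : "!"}`;
  });
  return [`type ${def.name} {`, ...lines, "}"].join("\n");
}

const sdl = printType({
  name: "Product",
  fields: {
    id: { type: "ID" },
    name: { type: "String", nullable: true },
  },
});
```

The definition is checked by the TypeScript compiler, and the string in `sdl` is the GraphQL contract that clients see.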
I don't see how there's much of a future for "Schema-only" GraphQL APIs,
but am happy to be proven wrong.
Conclusion
I think we will see some more people jumping on the "do everything with GraphQL" bandwagon,
and eventually these companies will come to terms with the fact that GraphQL is not a silver bullet.
GraphQL is a great tool for the API tool belt,
but definitely not a configuration language,
also not a transformation language,
nor a replacement for SQL,
and definitely not Terraform.
Cheers!
At WunderGraph, our understanding of GraphQL is a bit different than what you might have seen so far.
We use GraphQL as a server-side language to manage and query API dependencies,
keeping the GraphQL layer hidden behind a JSON-RPC/REST API.