German Robayo
Posted on August 24, 2020
Introduction
Hi there! It's been a while since my last post and a cool feature that I've been working on recently it's roughly done, and it's really cool!
The feature is jump-to-definition (I'll abbreviate that to JTD in the rest of the post), and it allows rendered source code to be more navigatable.
TL;DR
To see some examples of this feature, you can visit the following links:
-
Prelude.JSON.renderAs
https://hydra.dhall-lang.org/build/70743/download/1/docs/JSON/renderAs.dhall.html - Every file with the
JumpTo
prefix on this list https://hydra.dhall-lang.org/build/70746/download/1/docs/
Jump to definition
The main purpose of using a documentation generator for your projects (in any programming language) is to make it easy for your users to learn how to use your library/framework/package/whatsoever. Simple things such as analyzing source-code comments (as dhall-docs
already does) are good enough, but we can always add more features that make code discoverable and easier to read.
Jump to definition is a feature that most documentation generators tools have. Let's use haddock
as an example: when haddock
generates the documentation of a Haskell packages it generates HTML for:
- The source-code comments that the user marked as documentation.
- The actual source code of the package.
If we use the dhall
package, for example:
- This is the generated documentation for
Dhall.Core
(which re-exportsDhall.Syntax
) https://hackage.haskell.org/package/dhall-1.34.0/docs/Dhall-Core.html; - This is how
haddock
renders the source code for theDhall.Syntax
module https://hackage.haskell.org/package/dhall-1.34.0/docs/src/Dhall.Syntax.html
If you open the rendered source code for Dhall.Syntax
, you will see that:
- It is rendered using syntax-highlighting
- It's full of links on the data-types and function names, which suggests you that it's clickable and it allows you to share the implementation of anything inside that module. For instance, you can see here the definition of the
Expr
data type: the Dhall AST.
We want to add the same for dhall-docs
: take a file and do static analysis to see where each name was defined, highlight it correctly when the user hovers it and jump to the definition of that name when the user clicks it.
Implementation
The Dhall AST
The Expr
data-type from the dhall
package contains a special constructor useful to get the source-span of a parsed expression:
data Expr s a =
...
| Note s (Expr s a)
...
(you can see its haddock here)
The purpose of Note
is to create useful parse and type-errors by underlining the problematic expression. The Dhall parser outputs an Expr Src Import
(explanations of Src
and Import
can be found on Expr
haddock) where each meaningful expression is wrapped on a Note
constructor. Src
s contains metadata information about the parsed expression, such as source text and line and column numbers. For example, if you have the following expression:
1 + 2
it will be parsed to something (roughly) like this:
Note (source code info of the "1 + 2" expression)
(NaturalPlus -- for the `+` operator
(Note (source code info for the "1") (NaturalLit 1))
(Note (source code info for the "1") (NaturalLit 2))
)
Besides Note
, there are other constructors that preserve extra-information about the source code. It is mostly used by the dhall format
and dhall lint
commands.
Code renderer
Now the question is, how can I manipulate/traverse the Expr
in a way that lets me generates HTML for the following?
- The declaration & use of names (variables)
- Relative imports
- The rest of the code, that should not be affected at all.
I first started making this simple data-type:
-- we will come to the ??? later
data NameDecl = NameDecl Src Text ???
data SourceCodeType
= ImportExpr Dhall.Import
| NameUse NameDecl
| NameDeclaration NameDecl
data SourceCodeFragment = SourceCodeFragment Src SourceCodeType
You can think of each SourceCodeFragment
as a subsequent section of the source code. Similar to a token (but not exactly that, though). Each one is discriminated by its SourceCodeType
, which carries the information of how the code should be rendered in the HTML.
Now we need to transform an Expr Src Import
to a [SourceCodeFragment]
and that's where the fragments
function comes:
fragments :: Expr Src Import -> [SourceCodeFragment]
The implementation of the fragments
function is quite large to fit in this post (110 lines, to be exact) but what it does is run a simple context-analysis on the AST to generate SourceCodeFragment
whenever a name was declared and used. It is clever enough to know when a name was not declared, and when a name was unused and showing no highlight on its HTML (i.e. not generating a SourceCodeFragment
for it).
All the work is done in the infer
function, declared inside fragments
:
-- explanation for ??? will come up later
infer :: Dhall.Context NameDecl -> Expr Src Import -> Writer [SourceCodeFragment] ???
Dhall.Context
is a context table that allows us to insert and query for names when traversing an AST. We insert names when it makes sense to (i.e. let-bindings, record fields, and lambda parameters.
All of this work allowed dhall-docs
to generate jump-to-definition for the following:
- Let-bindings
- Lambda argument names
- Record (literals and types) field names
- Imports (relative and remote imports)
You can see here for a demo. The test cases are named with the feature that is being shown there.
What about the ???
on the code?
let-binding and lambda argument names share the same JTD logic:
- Calculate the source span position of the binding name
- Traverse the sibling expressions on both
Let
andLam
adding the name to the context accordingly.
However, adding support for record field is slightly different. We want to highlight the names where fields in selector-expressions were defined. In the following example we want to highlight foo
when the user hovers the foo
in the selector-expression a.foo
:
let a = { foo = 2 }
in a.foo
Pushing the foo
label to the context isn't the right solution because handling the exact expression to remote the foo
label would be complicated.
What we instead do, is to use another data-type:
data JtdInfo
= RecordFields (Set.Set NameDecl)
| NoInfo
???
corresponds to JtdInfo
. You can think of JtdInfo
as extra information recovered from the static-analysis that could aid in more precise generated JTD. Replacing ???
on the above snippets, we got:
data NameDecl = NameDecl Src Text JtdInfo
infer :: Dhall.Context NameDecl -> Expr Src Import -> Writer [SourceCodeFragment] JtdInfo
Now each NameDecl
has an associated JtdInfo
, and every expression returns a JtdInfo
from infer
. Now we can pattern match against the Field
constructor to correctly setup JTD:
Field e (FieldSelection (Just Src{srcEnd=posStart}) label (Just Src{srcStart=posEnd})) -> do
fields <- do
dhallType <- infer context e
case dhallType of
NoInfo -> return mempty
RecordFields s -> return $ Set.toList s
let src = makeSrcForLabel posStart posEnd label
let match (NameDecl _ l _) = l == label
case filter match fields of
x@(NameDecl _ _ t) : _ -> do
Writer.tell [SourceCodeFragment src (NameUse x)]
return t
_ -> return NoInfo
We setup JTD only if it makes sense. Thankfully pattern matching helps us elegantly handle all cases.
To end
This was the last feature that I developed as part of my GSoC project. Keep in tune to read my last report about a summary of what the whole project and a story about all my experience through.
Thanks for reading!
Posted on August 24, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.