※ Open data is data that anyone can openly access, exploit, edit, and share for any purpose. Open data is licensed under an open license. (https://en.wikipedia.org/wiki/Open_data)

For example

We have released dim (Open Data Package Manager), a CLI tool for managing open data.

Demo

I thought open data should be managed by a package manager, just as software is (e.g., npm, apt, pip, gem).

When fetching open data, it would be convenient for users to be able to fetch it with a command like:

npm install xxxxx

Once data is installed, it is recorded in dim.json, much as dependencies are recorded in package.json.

Stop chaotic open data management

Package managers (npm, gem, apt, and so on) have established a systematic way to manage software and libraries. However, users of open data have no comparable systematic approach.

If you were given the assignment to visualize a map using some kind of open data, how would you prepare the data?

The following flow is a common example.

1. Search Google for the open data you want
2. When you find the open data you want, download it from your browser
3. Check the open data, and return to step 1 if it is incomplete or not what you wanted
4. Process the open data for use (character encoding conversion, file format conversion, etc.)
5. Save the open data in the project directory or a database

This process is sufficient for simple projects.
However, you may want to record the specs (name, URL, last-updated date, etc.) of the open data in cases such as the following:

Projects developed by multiple people
Projects to be maintained in the medium to long term
Public projects (published on GitHub as OSS, etc.)

List of required open data specifications

If you download the open data from various sites and process datasets, you may forget where you downloaded the open data from or how you processed the data. Therefore, it is useful to record the following specifications.

URL
Last-updated
Version
Post-processing
Hash value, etc.
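As a rough illustration of recording these specifications by hand, the following Python sketch builds one record for a downloaded file. The `build_record` helper and all field names are assumptions made for this example; they are not dim's actual dim.json format.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_record(name, url, content, post_processing=None, version=None):
    """Build a metadata record for one downloaded dataset.

    All field names are hypothetical, chosen to mirror the list above.
    """
    return {
        "name": name,
        "url": url,
        # Time this record was created, in UTC (ISO 8601 format).
        "last_updated": datetime.now(timezone.utc).isoformat(),
        "version": version,
        # Free-text descriptions of the processing steps that were applied.
        "post_processing": list(post_processing or []),
        # SHA-256 of the raw bytes lets you detect later changes to the data.
        "hash": hashlib.sha256(content).hexdigest(),
    }


if __name__ == "__main__":
    record = build_record(
        name="example",
        url="https://example.com",
        content=b"id,name\n1,foo\n",
        post_processing=["converted to UTF-8"],
    )
    print(json.dumps(record, indent=2))
```

Keeping a record like this next to the project makes it possible to tell later where a file came from and whether its contents have changed.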

Approach

We have released v1.0 of dim (Open Data Package Manager), a CLI tool.

https://github.com/c-3lab/dim

Features

(1) Support for search/download/processing/recording processes

dim supports searching for, downloading, processing, and recording open data. It can also run the whole series of steps through interactive commands.

(2) Support for post-processing commonly used in data processing

dim includes several post-processes commonly used in data processing. The post-process is recorded along with the data URL. You can also use your own scripts as post-processes.
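To make the idea of a post-process concrete, here is a minimal Python sketch of a character encoding conversion, one of the processing steps mentioned earlier. It is a standalone illustration rather than dim's implementation: the `convert_encoding` helper and the Shift_JIS default are assumptions, and how dim hands a file to a custom script is not covered here.

```python
import tempfile
from pathlib import Path


def convert_encoding(src, dst, src_encoding="shift_jis", dst_encoding="utf-8"):
    """Re-encode a text file, for example a Shift_JIS CSV to UTF-8.

    Works on raw bytes so that line endings are preserved exactly.
    """
    text = Path(src).read_bytes().decode(src_encoding)
    Path(dst).write_bytes(text.encode(dst_encoding))
    return Path(dst)


if __name__ == "__main__":
    # Self-contained demo: write a Shift_JIS file, then convert it.
    workdir = Path(tempfile.mkdtemp())
    source = workdir / "input.csv"
    source.write_bytes("名前,値\r\n東京,1\r\n".encode("shift_jis"))
    result = convert_encoding(source, workdir / "output.csv")
    print(result.read_text(encoding="utf-8"))
```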

(3) Prepare data in one step using the existing data specification file

You can fetch and process all open data in one step by using a data specification file (dim.json) that has already been recorded.
By publishing the data specification file (dim.json) on GitHub, you can share just that file without including the open data itself in the repository.
(This is the same as publishing package.json, etc. to GitHub.)
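The idea of replaying a recorded specification can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how dim works internally: the `install_all` helper, the `contents` key, and the `name`/`url`/`hash` fields are all hypothetical, and the download function is passed in so the example runs without network access.

```python
import hashlib
import tempfile
from pathlib import Path


def install_all(spec, fetch, dest_dir):
    """Fetch every dataset listed in `spec` and save it under `dest_dir`.

    `spec` is a hypothetical structure: {"contents": [{"name", "url", "hash"}]}.
    `fetch` is any callable that takes a URL and returns the body as bytes.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    installed = []
    for entry in spec["contents"]:
        body = fetch(entry["url"])
        # Compare against the recorded hash, if any, to detect changed data.
        expected = entry.get("hash")
        if expected is not None and hashlib.sha256(body).hexdigest() != expected:
            raise ValueError(f"hash mismatch for {entry['name']}")
        path = dest / entry["name"]
        path.write_bytes(body)
        installed.append(path)
    return installed


if __name__ == "__main__":
    # Demo with an in-memory "server" instead of real HTTP requests.
    fake_server = {"https://example.com/a.csv": b"a,b\n1,2\n"}
    demo_spec = {
        "contents": [
            {
                "name": "a.csv",
                "url": "https://example.com/a.csv",
                "hash": hashlib.sha256(b"a,b\n1,2\n").hexdigest(),
            }
        ]
    }
    print(install_all(demo_spec, fake_server.__getitem__, tempfile.mkdtemp()))
```

For real downloads, `fetch` could be a small wrapper around `urllib.request.urlopen(url).read()`; the repository then only needs to contain the specification, not the data.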

About the development environment

Language: TypeScript
Execution environment: Deno
CI/CD: GitHub Actions
- CI: Test/Lint/Type Check/Coverage
- CD: When a release is tagged, automatically build the dim binaries, upload them, and publish the release

We are using Deno, which some expect to replace Node.js. We rated Deno highly for the following reasons:

It is simple to set up, so projects are easy to start
A linter and formatter are provided as standard features
TypeScript can be executed as is, etc.

Using dim

Install dim

Download the dim binary for your platform.

aarch64-apple-darwin

curl -L https://github.com/c-3lab/dim/releases/latest/download/aarch64-apple-darwin-dim -o /usr/local/bin/dim

x86_64-apple-darwin

curl -L https://github.com/c-3lab/dim/releases/latest/download/x86_64-apple-darwin-dim -o /usr/local/bin/dim

x86_64-pc-windows-msvc

curl -L https://github.com/c-3lab/dim/releases/latest/download/x86_64-pc-windows-msvc-dim.exe -o C:\Users\user-name\dim.exe

x86_64-unknown-linux-gnu

curl -L https://github.com/c-3lab/dim/releases/latest/download/x86_64-unknown-linux-gnu-dim -o /usr/local/bin/dim

Grant execute permission to the user

chmod u+x /usr/local/bin/dim

New Project

Initialize the project

The init command generates dim.json, dim-lock.json, and data_files/.

$ dim init

Install data

This command stores information about installed data in dim.json and dim-lock.json.

$ dim install https://example.com -n "example"

Installed data is saved in `data_files/`.

$ ls ./data_files

Install all data from a shared dim.json

When project members share a dim.json, you can install everything it lists at once.

Make sure dim.json exists in the current directory

$ ls ./

dim.json  ....

Install all data listed in dim.json

$ dim install

Installed data is saved in `data_files/`.

$ ls ./data_files

In addition to these functions, dim has many other features expected of a package manager.
https://github.com/c-3lab/dim#command-usage

For example:

Search for open data
Update open data
Uninstall open data
Download a dim.json over the internet
Use dim from Python (https://github.com/c-3lab/dim-python)

etc.

We have released v1.0 of dim, which manages open data the way a package manager manages software.

There are still many features we want to add. If you share our concerns about these issues and would like to work on them with us, you are very welcome to join.

🚀 How to manage the Open Data in your project / Release package manager for open data📦

ryo-ma
