DATA: The Data Package Manager CLI

Getting started

data is a command-line tool that helps publishers prepare and upload data to the DataHub. With data you can:

  • Publish a Data Package to the DataHub
  • Get a Data Package from the DataHub
  • Remove an uploaded Data Package from the DataHub
  • Get information about a particular Data Package
  • Normalize a Data Package according to the specs
  • Validate your data to ensure its quality
  • Set up a configuration file in order to publish


Installing binaries without npm

On the releases page, you can download pre-built binaries for macOS and Linux (x64). To run the tool from any directory, you may need to move the pre-built binary into a bin directory on your PATH (e.g. /usr/local/bin/):

mv path/to/data-{os-distribution} /usr/local/bin/data
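
The move step above can be wrapped in a small shell function. This is only a sketch and not part of any official installer: the name install_data_binary is hypothetical, and it assumes the downloaded file just needs to be copied under the name data and marked executable.

```shell
# Hypothetical helper (not shipped with data): copy a downloaded binary
# into a bin directory and mark it executable.
# Usage: install_data_binary <downloaded-file> [bin-dir]
install_data_binary() {
  src="$1"
  bin_dir="${2:-/usr/local/bin}"

  # Stop early if the downloaded file is missing.
  if [ ! -f "$src" ]; then
    echo "no such file: $src" >&2
    return 1
  fi

  # Copy under the name "data" and make the copy executable.
  cp "$src" "$bin_dir/data" && chmod +x "$bin_dir/data"
}
```

Writing to /usr/local/bin usually requires elevated permissions, so you may need sudo or a bin directory that you own. Afterwards, data --version should print the installed version.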

Installing from npm

You can also install it from npm as follows:

npm install -g data


To see the available commands and get help, run:

data --help

The output of the help command:

❒ data [options] <command> <args>

    push        [path]        Push data to the DataHub
    get         [pkg-id]      Get data from DataHub
    purge       [owner/name]  Permanently deletes data from DataHub
  Data Package specific:
    info        [pkg-id]      Get info on data
    normalize                 Normalize datapackage.json
    validate                  Validate Data Package structure

    config                    Set up configuration
    help        [cmd]         Show help on cmd

-h, --help              Output usage information
-v, --version           Output the version


Configure data with the data config[ure] command. It asks you for a username, a secretToken, and the server and bitStore addresses of the DataHub.

The config is stored in ~/.datahub/config, and you can edit it with any text editor. A simple config file can look like this:

username = myname
access_token = mykey
server = server URL for publishing Eg:
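
Because the file is a plain list of key = value lines, individual settings can be read back with standard shell tools, for example to check which username is configured. The sketch below assumes exactly the format shown above; config_get is a hypothetical helper and not a data command.

```shell
# Hypothetical helper (not part of data): print the value stored under
# a key in a "key = value" config file.
# Usage: config_get <key> [file]
config_get() {
  key="$1"
  file="${2:-$HOME/.datahub/config}"

  # Split each line at the first "=", trim surrounding whitespace, and
  # print the value of the first line whose key matches.
  awk -v key="$key" '
    {
      pos = index($0, "=")
      if (pos == 0) next
      k = substr($0, 1, pos - 1)
      v = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { print v; exit }
    }
  ' "$file"
}
```

With the example file above, config_get username would print the configured username. If the key is absent, nothing is printed.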



To publish a Data Package, go to the Data Package directory (with datapackage.json) and run:

data push

If your configured username and secretToken are correct, data will upload datapackage.json and all relevant resources to the DataHub server.
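
Since data push is run from a directory that contains datapackage.json, a quick check beforehand can catch the common mistake of pushing from the wrong place. The following is a minimal sketch; push_package is a hypothetical wrapper, and the only data command it calls is the data push shown above.

```shell
# Hypothetical wrapper (not part of data): run "data push" from a
# Data Package directory, but only if the descriptor file is present.
# Usage: push_package [dir]
push_package() {
  dir="${1:-.}"

  if [ ! -f "$dir/datapackage.json" ]; then
    echo "no datapackage.json in $dir" >&2
    return 1
  fi

  # Use a subshell so the caller's working directory is unchanged.
  (cd "$dir" && data push)
}
```

Calling push_package path/to/package then behaves like changing into that directory and running data push by hand.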


To get a Data Package, run the following command:

data get <publisher>/<package>

The Data Package will be downloaded into the current working directory.


To permanently delete a Data Package from the DataHub, use the data purge command:

data purge


To get information about a particular Data Package:

data info


To normalize a Data Package descriptor according to the specs:

data norm[alize] [path]


To validate a Data Package descriptor against the schema:

data validate [path | URL]


To set up the configuration file:

data config[ure]