import-io-cli

This toolchain allows import.io users and managed service providers to build out scalable extractor definitions by creating a modular Extractor Library.

To jump straight into the browser context methods available, see IContext.

Getting started

Download

CLI

Mac

Windows x64

Windows x86

Linux Debian amd64

Linux Debian armel

Desktop App

Mac

Windows

Linux

Concepts

An import.io Extractor Library is a git repository that contains a library of modules and extractors for one or more organizations.

There are multiple types of modules:

  • Action
    • A browser control and logic building block
    • Uses a browser context to control the browser - see IContext
    • Action may be used as an interface
      • A default definition may be provided
      • Named parameters (e.g. domain, country)
  • Schema
    • A definition of what columns are expected to be returned
  • Extraction
    • A definition of what to extract on the page
  • Robot
    • An extractor template that org extractors inherit from when they are created

Each instance of a module maps to a file under the top-level src/library folder of the git repository, and has a URI composed of its type and path, e.g. "robot:product/details".
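
As an illustration of the URI format, a module URI can be split into its type and path with plain shell parameter expansion. This is only a sketch: parse_module_uri is a hypothetical helper, not part of the CLI, and the lowercase type names other than "robot" are an assumption based on the example above.

```shell
# Hypothetical helper: split "<type>:<path>" and check the type against
# the module types listed above (lowercase names are an assumption).
parse_module_uri() {
  uri="$1"
  case "$uri" in
    *:*) ;;
    *) echo "not a module URI: $uri" >&2; return 1 ;;
  esac
  module_type="${uri%%:*}"   # text before the first ":"
  module_path="${uri#*:}"    # text after the first ":"
  case "$module_type" in
    action|schema|extraction|robot) ;;
    *) echo "unknown module type: $module_type" >&2; return 1 ;;
  esac
  echo "$module_type $module_path"
}

parse_module_uri "robot:product/details"   # prints: robot product/details
```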

Getting started

Install Google Chrome

Download and install Google Chrome (not Chromium) if you don't already have it.

Install the client

Download and install the import-io client via a pkg file (macOS), installer (Windows) or tarball (Linux).

Configure

To configure, run:

$ import-io config

This will prompt you for your SaaS API key and write it to a file named .import-io.apikey in your home directory.
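
To confirm that configuration succeeded, you can check that the key file exists and is not empty. The sketch below assumes nothing about the file's contents; check_apikey is a hypothetical helper, not a CLI command.

```shell
# Location the config step writes to, per the description above.
APIKEY_FILE="${HOME}/.import-io.apikey"

# Hypothetical helper: report whether the given file exists and is non-empty.
check_apikey() {
  if [ -s "$1" ]; then
    echo "found"
  else
    echo "missing"
  fi
}

echo "API key file ${APIKEY_FILE}: $(check_apikey "$APIKEY_FILE")"
```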

Give it a try

Before going on to the next steps, verify that you can start the browser and get a REPL to control it by running:

import-io browser:launch

This will start a browser and give you a REPL for controlling it, along with the browser's dev tools.

If this fails, the CLI may be unable to find your Chrome installation. In that case, set the CHROME_PATH environment variable.

For example, on Ubuntu Linux, add this to your .profile (with the proper path):

export CHROME_PATH=/usr/bin/google-chrome
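
If you are unsure where Chrome lives, a small lookup can fill in CHROME_PATH for you. This is a sketch only: first_on_path is a hypothetical helper, and the candidate binary names (google-chrome, google-chrome-stable) are assumptions that may not match your distribution.

```shell
# Hypothetical helper: print the full path of the first name found on PATH.
first_on_path() {
  for name in "$@"; do
    if found=$(command -v "$name" 2>/dev/null); then
      echo "$found"
      return 0
    fi
  done
  return 1
}

# Only set CHROME_PATH when it is not already defined.
if [ -z "${CHROME_PATH:-}" ]; then
  # Candidate names are assumptions; adjust them for your system.
  if chrome=$(first_on_path google-chrome google-chrome-stable); then
    export CHROME_PATH="$chrome"
  fi
fi

echo "CHROME_PATH=${CHROME_PATH:-<not found>}"
```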

Usage

$ npm install -g import-io-cli
$ import-io COMMAND
running command...
$ import-io (-v|--version|version)
import-io-cli/1.0.0 darwin-x64 node-v14.10.1
$ import-io --help [COMMAND]
USAGE
  $ import-io COMMAND
...

Commands Reference

import-io action:compile

Compile an action to JS

USAGE
  $ import-io action:compile

OPTIONS
  -P, --parameters=parameters  (required) parameter values, key=value
  -a, --action=action          (required) action path
  -h, --help                   show CLI help

See code: src/commands/action/compile.ts

import-io action:implement

Implement an action interface

USAGE
  $ import-io action:implement

OPTIONS
  -I, --interface=interface    (required) path to where interface is
  -P, --parameters=parameters  (required) parameter values, key=value
  -h, --help                   show CLI help

See code: src/commands/action/implement.ts

import-io action:interface

Create a new interface with default implementation

USAGE
  $ import-io action:interface

OPTIONS
  -P, --parameters=parameters  parameter name
  -h, --help                   show CLI help
  -i, --inputs=inputs          input name
  -p, --path=path              (required) new action path

See code: src/commands/action/interface.ts

import-io action:new

Create a new action (not an interface impl)

USAGE
  $ import-io action:new

OPTIONS
  -P, --parameters=parameters  parameter name
  -h, --help                   show CLI help
  -i, --inputs=inputs          input name
  -p, --path=path              (required) new action path

See code: src/commands/action/new.ts

import-io action:run:local

Run an action locally

USAGE
  $ import-io action:run:local

OPTIONS
  -C, --credentials=credentials  credentials for an auth extractor, username:password
  -I, --incognito                run with an incognito browser
  -P, --parameters=parameters    [default: ] parameter values, key=value
  -a, --action=action            (required) action path
  -c, --compile                  compile down to the code action (cannot use breakpoints)
  -h, --help                     show CLI help
  -i, --inputs=inputs            input values, key=value
  --proxy=proxy                  proxy, e.g. http://10.10.10.10:8000
  --proxyauth=proxyauth          proxyauth, e.g. user:password

See code: src/commands/action/run/local.ts
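
A possible invocation, using only the options listed above, might look like the following. The action path, parameter and input values are hypothetical placeholders, and the run wrapper is an illustrative helper that prints the command by default so the sketch can be tried without side effects.

```shell
# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

ACTION="product/navigate"          # hypothetical action path
PARAMS="domain=example.com"        # hypothetical parameter, key=value
INPUTS="url=https://example.com"   # hypothetical input, key=value

run import-io action:run:local \
  --action="$ACTION" \
  --parameters="$PARAMS" \
  --inputs="$INPUTS" \
  --incognito
```

Adding --compile runs the compiled code action instead, at the cost of breakpoints, as noted in the option description.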

import-io action:run:remote

Run the action locally using an import.io remote browser

USAGE
  $ import-io action:run:remote

OPTIONS
  -C, --country=country          country code for proxy to checkout
  -I, --incognito                run with an incognito browser
  -P, --parameters=parameters    [default: ] parameter values, key=value
  -a, --action=action            (required) action path
  -c, --credentials=credentials  Uses credentials for auth extractor, with username:password
  -h, --help                     show CLI help
  -i, --inputs=inputs            input values, key=value
  -r, --residential              Use residential proxies

See code: src/commands/action/run/remote.ts

import-io browser:launch [FILE]

Launch a browser for testing

USAGE
  $ import-io browser:launch [FILE]

OPTIONS
  -h, --help             show CLI help
  --proxy=proxy          proxy, e.g. http://10.10.10.10:8000
  --proxyauth=proxyauth  proxyauth, e.g. user:password

See code: src/commands/browser/launch.ts

import-io cache:clear

Clear cache

USAGE
  $ import-io cache:clear

See code: src/commands/cache/clear.ts

import-io config [FILE]

Initialize developer configuration

USAGE
  $ import-io config [FILE]

OPTIONS
  -h, --help  show CLI help

See code: src/commands/config.ts

import-io extraction:new PATH

Create a new extraction

USAGE
  $ import-io extraction:new PATH

OPTIONS
  -h, --help           show CLI help
  -s, --schema=schema  (required) schema path

See code: src/commands/extraction/new.ts

import-io extractor:build

Build extractor(s)

USAGE
  $ import-io extractor:build

OPTIONS
  -h, --help           show CLI help
  -o, --org=org        (required) org slug
  -p, --prefix=prefix  path prefix to search in

See code: src/commands/extractor/build.ts

import-io extractor:deploy

Deploy an extractor

USAGE
  $ import-io extractor:deploy

OPTIONS
  -a, --account=account  account (id) to deploy with
  -b, --branch=branch    (required) branch to deploy
  -h, --help             show CLI help
  -o, --org=org          (required) org slug
  -p, --prefix=prefix    path prefixes to search in

See code: src/commands/extractor/deploy.ts
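
As a rough sketch, extractor:build and extractor:deploy could be chained so that a deploy is only attempted after a successful build. Whether deploy already builds implicitly is not stated here, so the ordering is an assumption, as are the org slug, branch name and prefix. The run wrapper and build_and_deploy are illustrative helpers, not part of the CLI.

```shell
# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

ORG="my-org"       # hypothetical org slug
BRANCH="main"      # hypothetical branch name
PREFIX="product"   # hypothetical path prefix

# Deploy only if the build step succeeds.
build_and_deploy() {
  run import-io extractor:build --org="$ORG" --prefix="$PREFIX" &&
    run import-io extractor:deploy --org="$ORG" --branch="$BRANCH" --prefix="$PREFIX"
}

build_and_deploy
```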

import-io extractor:new

Create a new extractor from a robot

USAGE
  $ import-io extractor:new

OPTIONS
  -P, --parameters=parameters  [default: ] parameter values, key=value
  -a, --auth                   whether or not to scaffold out auth actions and dependencies
  -h, --help                   show CLI help
  -o, --org=org                (required) org slug
  -p, --path=path              path override
  -r, --robot=robot            (required) path to robot

See code: src/commands/extractor/new.ts

import-io extractor:run

Run an extractor

USAGE
  $ import-io extractor:run

OPTIONS
  -b, --branch=branch        (required) branch to run
  -d, --deploy               deploy before running, if false will run the currently saved extractor
  -e, --extractor=extractor  (required) path to extractor directory
  -h, --help                 show CLI help
  -i, --inputs=inputs        input values, key=value (NOT SUPPORTED)
  -o, --org=org              (required) org slug
  -w, --wait                 whether or not to wait until the crawl run completes

See code: src/commands/extractor/run.ts

import-io extractor:run:local

Run an extractor locally

USAGE
  $ import-io extractor:run:local

OPTIONS
  -c, --compile              compile down to the code action (cannot use breakpoints)
  -e, --extractor=extractor  (required) path to extractor directory
  -h, --help                 show CLI help
  -i, --inputs=inputs        (required) input values, key=value
  -o, --org=org              (required) org slug
  --proxy=proxy              proxy, e.g. http://10.10.10.10:8000
  --proxyauth=proxyauth      proxyauth, e.g. user:password

See code: src/commands/extractor/run/local.ts

import-io extractor:run:remote

Run an extractor remotely

USAGE
  $ import-io extractor:run:remote

OPTIONS
  -e, --extractor=extractor  (required) path to extractor directory
  -h, --help                 show CLI help
  -i, --inputs=inputs        input values, key=value
  -o, --org=org              (required) org slug

See code: src/commands/extractor/run/remote.ts

import-io extractor:tests:functional

Carry out a crawl run then compare the data

USAGE
  $ import-io extractor:tests:functional

OPTIONS
  -b, --branch=branch        (required) branch to run
  -d, --deploy               deploy before running, if false will run the currently saved extractor
  -e, --extractor=extractor  (required) path to extractor directory
  -h, --help                 show CLI help
  -o, --org=org              (required) org slug

See code: src/commands/extractor/tests/functional.ts

import-io extractor:tests:unit

Check that the extractions still give the expected data

USAGE
  $ import-io extractor:tests:unit

OPTIONS
  -h, --help           show CLI help
  -o, --org=org        (required) org slug
  -p, --prefix=prefix  path prefix to search in

See code: src/commands/extractor/tests/unit.ts

import-io extractor:tests:update

Import a crawl run as a set of test cases

USAGE
  $ import-io extractor:tests:update

OPTIONS
  -b, --branch=branch          extractor branch
  -c, --crawlRunId=crawlRunId  id of crawl run to import
  -e, --extractor=extractor    (required) path to extractor directory
  -h, --help                   show CLI help
  -o, --org=org                (required) org slug

See code: src/commands/extractor/tests/update.ts

import-io function:new PATH

Create a new function

USAGE
  $ import-io function:new PATH

OPTIONS
  -h, --help  show CLI help

See code: src/commands/function/new.ts

import-io help [COMMAND]

display help for import-io

USAGE
  $ import-io help [COMMAND]

ARGUMENTS
  COMMAND  command to show help for

OPTIONS
  --all  see all commands in CLI

See code: @oclif/plugin-help

import-io init

Initialize a library

USAGE
  $ import-io init

OPTIONS
  -h, --help     show CLI help
  -o, --org=org  org slug

See code: src/commands/init.ts

import-io robot:new

Create a new robot

USAGE
  $ import-io robot:new

OPTIONS
  -P, --parameters=parameters                    (required) parameters
  -a, --authentication=authentication            authentication action entrypoint
  -c, --checkAuthentication=checkAuthentication  validate authentication action entrypoint
  -e, --entryPoint=entryPoint                    (required) action entrypoint
  -h, --help                                     show CLI help
  -r, --path=path                                (required) path to robot
  -s, --schema=schema                            (required) path to schema

See code: src/commands/robot/new.ts

import-io schema:new PATH

Create a new schema

USAGE
  $ import-io schema:new PATH

OPTIONS
  -h, --help  show CLI help

See code: src/commands/schema/new.ts

import-io source:deploy

Deploy a source

USAGE
  $ import-io source:deploy

OPTIONS
  -c, --collection=collection  collection to deploy to
  -h, --help                   show CLI help
  -o, --org=org                org slug
  -p, --prefix=prefix          path prefixes to search in
  -r, --robot=robot            robot filter
  -s, --source=source          source slug

See code: src/commands/source/deploy.ts