v0.9.7

GeoWave Command Line Controls

Commands

Helpful Commands & Flags

GeoWave supports a few extra commands that can be used for informational purposes to debug or explore command usage.

Debug Flag (--debug)

Use the debug flag to raise GeoWave's console logging level to DEBUG. By default, the level is WARN. This flag must come immediately after 'geowave' and before any subcommand:

geowave --debug <command> <subcommand> <options...>

Version Flag (--version)

The version flag will output the build arguments that were used to build GeoWave, as well as the version of the GeoWave tools jar you’re using:

geowave --version

Help Command

The help command will show arguments and their defaults. It can be prepended to any GeoWave command. If you use it while also specifying a sub-command and its arguments, that command’s help information will be displayed:

geowave help <command> <subcommand>

Explain Command

The explain command will show a simplified tabular view of the arguments and their current values. Use this to determine what values are being passed to GeoWave. It also shows hidden parameters and their values, if there are any. An example would be additional Accumulo options:

geowave explain config addstore -t accumulo

Config Commands

Commands that affect local configuration only (Required options are designated with an *)

geowave config addindex

NAME

geowave config addindex - Configure an index for usage in GeoWave

SYNOPSIS

geowave config addindex [options] <name>

DESCRIPTION

The geowave config addindex operator will create a local index configuration that can be reused but is not associated with a store until data is ingested.

OPTIONS

  • -c, --crs (will only be shown if you have already defined spatial or spatial_temporal as your type)

    • The native Coordinate Reference System used within the index. All spatial data will be projected into this CRS for appropriate indexing as needed.

    • Default: EPSG:4326

  • -d, --default

    • Make this the default index when creating stores

  • --indexName

    • A custom name can be given to this index. The default name will be based on the configuration parameters.

  • -np, --numPartitions

    • The number of partitions.

    • Default: 1

  • -ps, --partitionStrategy

    • The partition strategy to use.

    • Default: NONE

    • Possible Values: [NONE, HASH, ROUND_ROBIN]

  • * -t, --type

    • The type of index, such as spatial or spatial_temporal
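For example, a spatial index configuration might be created as follows (the index name spatial_idx is a placeholder for illustration):

geowave config addindex -t spatial -c EPSG:4326 -ps HASH -np 4 spatial_idx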

geowave config addindexgrp

NAME

geowave config addindexgrp - Create an index group for usage in GeoWave

SYNOPSIS

geowave config addindexgrp [options] <name> <comma separated list of indexes>

DESCRIPTION

The geowave config addindexgrp operator will group multiple index configurations together given a name. This acts as a convenience for re-using multiple indices together on ingest.

OPTIONS

There are currently no options for this command

geowave config addstore

NAME

geowave config addstore - Create a store within GeoWave

SYNOPSIS

geowave config addstore [options] <name>

DESCRIPTION

The geowave config addstore operator will create a new store in GeoWave.

OPTIONS

  • -d, --default

    • Make this the default store in all operations

  • * -t, --type <arg>

    • The type of store, such as accumulo, memory, etc.

    • Required!

    • When the -t accumulo option is used, additional options are:

      • --gwNamespace

        • The geowave namespace

          • Default is no namespace

      • * -i, --instance

        • The Accumulo instance ID

        • Required!

      • * -p, --password

        • The password for the user

        • Required!

      • * -u, --user

        • A valid Accumulo user ID

        • Required!

      • * -z, --zookeeper

        • A comma-separated list of zookeeper servers that an Accumulo instance is using

        • Required!

    • When the -t hbase option is used, additional options are:

      • -d, --default

        • Make this the default index when creating stores

      • -np, --numPartitions

        • The number of partitions.

        • Default: 1

      • -ps, --partitionStrategy

        • The partition strategy to use.

        • Default: NONE

        • Possible Values: [NONE, HASH, ROUND_ROBIN]
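For example, an Accumulo-backed store might be configured as follows (the store name, namespace, instance, credentials, and zookeeper hosts are placeholders for illustration):

geowave config addstore -t accumulo --gwNamespace geowave.demo -i accumulo -u geowave -p secret -z zk1:2181,zk2:2181 demo_store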

geowave config aws

NAME

geowave config aws - Create a local configuration for AWS S3

SYNOPSIS

geowave config aws <AWS S3 endpoint URL> (for example s3.amazonaws.com)

DESCRIPTION

This command will create a local configuration for AWS S3.

OPTIONS

There are currently no options for this command

geowave config cpindex

NAME

geowave config cpindex - Copy and modify existing index configuration

SYNOPSIS

geowave config cpindex [options] <name> <new name>

DESCRIPTION

The geowave config cpindex operator will copy and modify an existing index configuration. It is possible to override values as you copy, such as cpindex old new --gwNamespace new_namespace.

OPTIONS

  • -d, --default

    • Make this the default index when creating stores

  • -np, --numPartitions

    • The number of partitions.

    • Default: 1

  • -ps, --partitionStrategy

    • The partition strategy to use.

    • Default: NONE

    • Possible Values: [NONE, HASH, ROUND_ROBIN]

geowave config cpstore

NAME

geowave config cpstore - Copy and modify existing store configuration

SYNOPSIS

geowave config cpstore [options] <name> <new name>

DESCRIPTION

The geowave config cpstore operator will copy and modify an existing GeoWave store. It is possible to override values as you copy, such as cpstore old new --gwNamespace new_namespace.

OPTIONS

  • -d, --default

    • Makes this the default store in all operations

geowave config geoserver

NAME

geowave config geoserver - Create a local configuration for GeoServer

SYNOPSIS

geowave config geoserver [options] <GeoServer URL (for example http://localhost:8080/geoserver or https://localhost:8443/geoserver), or simply host:port and appropriate assumptions are made>

DESCRIPTION

This command will create a local configuration for connecting to GeoServer.

OPTIONS

  • -p, --password

    • GeoServer Password - Can be specified as 'pass:<password>', 'file:<local file containing the password>', 'propfile:<local properties file containing the password>:<property file key>', 'env:<variable containing the pass>', or stdin

  • -u, --username

    • GeoServer User

  • -ws, --workspace

    • GeoServer Default Workspace

SSL CONFIGURATION OPTIONS

  • --sslKeyManagerAlgorithm

    • Specify the algorithm to use for the keystore.

  • --sslKeyManagerProvider

    • Specify the key manager factory provider.

  • --sslKeyPassword

    • Specify the password to be used to access the server certificate from the specified keystore file.

    • Can be specified as 'pass:<password>', 'file:<local file containing the password>', 'propfile:<local properties file containing the password>:<property file key>', 'env:<variable containing the pass>', or stdin

  • --sslKeyStorePassword

    • Specify the password to use to access the keystore file.

    • Can be specified as 'pass:<password>', 'file:<local file containing the password>', 'propfile:<local properties file containing the password>:<property file key>', 'env:<variable containing the pass>', or stdin

  • --sslKeyStorePath

    • Specify the absolute path to where the keystore file is located on the system. The keystore contains the server certificate to be loaded.

  • --sslKeyStoreProvider

    • Specify the name of the keystore provider to be used for the server certificate.

  • --sslKeyStoreType

    • The type of keystore file to be used for the server certificate.

  • --sslSecurityProtocol

    • Specify the Transport Layer Security (TLS) protocol to use when connecting to the server. By default, the system will use TLS.

  • --sslTrustManagerAlgorithm

    • Specify the algorithm to use for the truststore.

  • --sslTrustManagerProvider

    • Specify the trust manager factory provider.

  • --sslTrustStorePassword

    • Specify the password to use to access the truststore file.

    • Can be specified as 'pass:<password>', 'file:<local file containing the password>', 'propfile:<local properties file containing the password>:<property file key>', 'env:<variable containing the pass>', or stdin

  • --sslTrustStorePath

    • Specify the absolute path to where the truststore file is located on the system.

    • The truststore file is used to validate client certificates.

  • --sslTrustStoreProvider

    • Specify the name of the truststore provider to be used for the server certificate.

  • --sslTrustStoreType

    • Specify the type of keystore used for the truststore, e.g., JKS (Java KeyStore).
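For example, a connection to a local GeoServer might be configured as follows (the credentials and workspace are placeholders for illustration):

geowave config geoserver -u admin -p pass:geoserver -ws demo_ws http://localhost:8080/geoserver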

geowave config hdfs

NAME

geowave config hdfs - Create a local configuration for HDFS

SYNOPSIS

geowave config hdfs [options] <HDFS DefaultFS URL>

DESCRIPTION

This command will create a local configuration for HDFS.

OPTIONS

There are currently no options for this command

geowave config list

NAME

geowave config list - List property names within the cache

SYNOPSIS

geowave config list [options]

DESCRIPTION

The geowave config list operator will list all properties in the local configuration. The -f or --filter option takes a regex to filter the list (useful regexes include 'store' or 'index', to isolate properties for one or the other, or a particular store/index name to further narrow the list).

OPTIONS

  • -f, --filter <arg>

    • Filter list by a regex
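For example, to list only store-related properties:

geowave config list -f store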

geowave config newcryptokey

NAME

geowave config newcryptokey - Generate a new security cryptography key for use with configuration properties

SYNOPSIS

geowave config newcryptokey

DESCRIPTION

This command will generate a new security cryptography key for use with configuration properties. This is primarily used if there is a need to re-encrypt the local configurations based on a new security token, should the old one have been compromised.

OPTIONS

There are currently no options available for this command

geowave config rmindex

NAME

geowave config rmindex - Remove an index configuration from the GeoWave configuration

SYNOPSIS

geowave config rmindex [options] <name>

DESCRIPTION

The geowave config rmindex operator will remove an index configuration from the GeoWave configuration.

OPTIONS

There are currently no options for this command

geowave config rmindexgrp

NAME

geowave config rmindexgrp - Remove an index group from the GeoWave configuration

SYNOPSIS

geowave config rmindexgrp [options] <name>

DESCRIPTION

The geowave config rmindexgrp operator will remove an index group from the GeoWave configuration.

OPTIONS

There are currently no options for this command

geowave config rmstore

NAME

geowave config rmstore - Remove an existing store from the GeoWave configuration

SYNOPSIS

geowave config rmstore [options] <name>

DESCRIPTION

The geowave config rmstore operator will remove an existing store from the GeoWave configuration.

OPTIONS

There are currently no options for this command

geowave config set

NAME

geowave config set - Set a valid property name within the cache

SYNOPSIS

geowave config set [options]

DESCRIPTION

The geowave config set operator will set a valid property name within the cache. This can be useful if you want to update a particular property of an index or store.

OPTIONS

There are currently no options for this command


Ingest Commands

Commands that ingest data directly into GeoWave or stage data to be ingested into GeoWave (Required options are designated with an *)

geowave ingest kafkaToGW

NAME

geowave ingest kafkaToGW - Subscribe to a Kafka topic and ingest into GeoWave

SYNOPSIS

geowave ingest kafkaToGW [options] <store name> <comma delimited index/group list>

DESCRIPTION

The geowave ingest kafkaToGW operator will subscribe to a Kafka topic and ingest the streamed data into GeoWave.

OPTIONS

  • --autoOffsetReset

    • What to do when there is no initial offset in ZooKeeper or if an offset is out of range:

      • smallest : automatically reset the offset to the smallest offset

      • largest : automatically reset the offset to the largest offset

      • anything else: throw exception to the consumer

  • --avro.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --avro.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --avro.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --batchSize

    • The data will automatically flush after this number of entries

    • Default: 10000

  • --consumerTimeoutMs

    • By default, this value is -1 and a consumer blocks indefinitely if no new message is available for consumption. By setting the value to a positive integer, a timeout exception is thrown to the consumer if no message is available for consumption after the specified timeout value.

  • -x, --extension

    • individual or comma-delimited set of file extensions to accept (optional)

  • --fetchMessageMaxBytes

    • The number of bytes of messages to attempt to fetch for each topic-partition in each fetch request. These bytes will be read into memory for each partition, so this helps control the memory used by the consumer. The fetch request size must be at least as large as the maximum message size the server allows or else it is possible for the producer to send messages larger than the consumer can fetch.

  • -f, --formats

    • Explicitly set the ingest formats by name (or multiple comma-delimited formats), if not set all available ingest formats will be used

  • --gdelt.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gdelt.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gdelt.extended

    • A flag to indicate whether extended data format should be used

    • Default: false

  • --gdelt.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geolife.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --geolife.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geolife.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geotools-raster.coverage

    • Optional parameter to set the coverage name (default is the file name)

  • --geotools-raster.crs

    • A CRS override for the provided raster file

  • --geotools-raster.histogram

    • Build a histogram of samples per band on ingest for performing band equalization

    • Default: false

  • --geotools-raster.mergeStrategy

    • Optional parameter to choose a tile merge strategy used for mosaic.

    • Default behavior will be none.

    • Alternatively 'no-data' will mosaic the most recent tile over previous tiles, except where there are no data values.

  • --geotools-raster.nodata

    • Optional parameter to set 'no data' values. If one value is given, it is applied to each band; if multiple are given, then the first totalNoDataValues/totalBands values are applied to the first band and so on, so each band can have multiple differing 'no data' values if needed

    • Default: []

  • --geotools-raster.pyramid

    • Build an image pyramid on ingest for quick reduced resolution query

    • Default: false

  • --geotools-raster.separateBands

    • Optional parameter to separate each band into its own coverage name. By default the coverage name will have '_Bn' appended to it where n is the band’s index.

    • Default: false

  • --geotools-raster.tileSize

    • Optional parameter to set the tile size stored (default is 256)

    • Default: 256

  • --geotools-vector.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geotools-vector.data

    • A map of date field names to the date format of the file. Use commas to separate each entry, then the first ':' character will separate the field name from the format. Use '\,' to include a comma in the format. For example: "time:MM:dd:YYYY,time2:YYYY/MM/dd hh:mm:ss" configures fields 'time' and 'time2' as dates with different formats

  • --geotools-vector.type

    • Optional parameter that specifies specific type name(s) from the source file

    • Default: []

  • --gpx.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gpx.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gpx.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --groupId

    • A string that uniquely identifies the group of consumer processes to which this consumer belongs. By setting the same group ID, multiple processes indicate that they are all part of the same consumer group.

  • * --kafkaprops

    • Properties file containing Kafka properties

  • --reconnectOnTimeout

    • This flag will flush when the consumer timeout occurs (based on kafka property 'consumer.timeout.ms') and immediately reconnect

    • Default: false

  • --tdrive.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --tdrive.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --tdrive.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --twitter.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --twitter.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --twitter.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -v, --visibility

    • The visibility of the data ingested (optional; default is 'public')

  • --zookeeperConnect

    • Specifies the ZooKeeper connection string in the form hostname:port where host and port are the host and port of a ZooKeeper server. To allow connecting through other ZooKeeper nodes when that ZooKeeper machine is down you can also specify multiple hosts in the form hostname1:port1,hostname2:port2,hostname3:port3.
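For example, a minimal invocation might look like the following (the properties file, store name, and index name are placeholders for illustration):

geowave ingest kafkaToGW --kafkaprops ./kafka.properties -f gpx my_store spatial_idx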

geowave ingest listplugins

NAME

geowave ingest listplugins - List supported data store types, index types, and ingest formats

SYNOPSIS

geowave ingest listplugins [options]

DESCRIPTION

This command will list all the data store types, index types, and ingest formats supported by the version of GeoWave being run.

OPTIONS

There are currently no options for this command

geowave ingest localToGW

NAME

geowave ingest localToGW - Ingest supported files directly from the local file system, from S3, or from HDFS

SYNOPSIS

geowave ingest localToGW [options] <file or directory> <storename> <comma delimited index/group list>

DESCRIPTION

The geowave ingest localToGW operator will run the ingest code (parse to features, load features to GeoWave) against local file system content.

OPTIONS

  • --avro.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --avro.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --avro.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -x, --extension

    • individual or comma-delimited set of file extensions to accept (optional)

  • -f, --formats

    • Explicitly set the ingest formats by name (or multiple comma-delimited formats), if not set all available ingest formats will be used

  • --gdelt.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gdelt.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gdelt.extended

    • A flag to indicate whether extended data format should be used

    • Default: false

  • --gdelt.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geolife.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --geolife.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geolife.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geotools-raster.coverage

    • Optional parameter to set the coverage name (default is the file name)

  • --geotools-raster.crs

    • A CRS override for the provided raster file

  • --geotools-raster.histogram

    • Build a histogram of samples per band on ingest for performing band equalization

    • Default: false

  • --geotools-raster.mergeStrategy

    • Optional parameter to choose a tile merge strategy used for mosaic.

    • Default behavior will be none. Alternatively 'no-data' will mosaic the most recent tile over previous tiles, except where there are no data values.

    • Default: none

  • --geotools-raster.nodata

    • Optional parameter to set 'no data' values. If one value is given, it is applied to each band; if multiple are given, then the first totalNoDataValues/totalBands values are applied to the first band and so on, so each band can have multiple differing 'no data' values if needed

    • Default: []

  • --geotools-raster.pyramid

    • Build an image pyramid on ingest for quick reduced resolution query

    • Default: false

  • --geotools-raster.separateBands

    • Optional parameter to separate each band into its own coverage name. By default the coverage name will have '_Bn' appended to it where n is the band’s index.

    • Default: false

  • --geotools-raster.tileSize

    • Optional parameter to set the tile size stored (default is 256)

    • Default: 256

  • --geotools-vector.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geotools-vector.data

    • A map of date field names to the date format of the file. Use commas to separate each entry, then the first ':' character will separate the field name from the format. Use '\,' to include a comma in the format. For example: "time:MM:dd:YYYY,time2:YYYY/MM/dd hh:mm:ss" configures fields 'time' and 'time2' as dates with different formats

  • --geotools-vector.type

    • Optional parameter that specifies specific type name(s) from the source file

    • Default: []

  • --gpx.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gpx.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gpx.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --tdrive.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --tdrive.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --tdrive.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -t, --threads

    • Number of threads to use for ingest; defaults to 1 (optional)

    • Default: 1

  • --twitter.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --twitter.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --twitter.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -v, --visibility

    • The visibility of the data ingested (optional; default is 'public')
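For example, a directory of GPX files might be ingested with four threads as follows (the path, store name, and index name are placeholders for illustration):

geowave ingest localToGW -f gpx -t 4 ./gpx_data my_store spatial_idx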

geowave ingest localToHdfs

NAME

geowave ingest localToHdfs - Stage supported files in local file system to HDFS

SYNOPSIS

geowave ingest localToHdfs [options] <file or directory> <hdfs host:port> <path to base directory to write to>

DESCRIPTION

The geowave ingest localToHdfs operator will stage supported files in the local file system to HDFS

OPTIONS

  • --avro.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --avro.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --avro.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -x, --extension

    • individual or comma-delimited set of file extensions to accept (optional)

  • -f, --formats

    • Explicitly set the ingest formats by name (or multiple comma-delimited formats), if not set all available ingest formats will be used

  • --gdelt.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gdelt.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gdelt.extended

    • A flag to indicate whether extended data format should be used

    • Default: false

  • --gdelt.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geolife.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --geolife.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geolife.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geotools-raster.coverage

    • Optional parameter to set the coverage name (default is the file name)

  • --geotools-raster.crs

    • A CRS override for the provided raster file

  • --geotools-raster.histogram

    • Build a histogram of samples per band on ingest for performing band equalization

    • Default: false

  • --geotools-raster.mergeStrategy

    • Optional parameter to choose a tile merge strategy used for mosaic.

    • Default behavior will be none. Alternatively 'no-data' will mosaic the most recent tile over previous tiles, except where there are no data values.

    • Default: none

  • --geotools-raster.nodata

    • Optional parameter to set 'no data' values. If one value is given, it is applied to each band; if multiple are given, then the first totalNoDataValues/totalBands values are applied to the first band and so on, so each band can have multiple differing 'no data' values if needed

    • Default: []

  • --geotools-raster.pyramid

    • Build an image pyramid on ingest for quick reduced resolution query

    • Default: false

  • --geotools-raster.separateBands

    • Optional parameter to separate each band into its own coverage name. By default the coverage name will have '_Bn' appended to it where n is the band’s index.

    • Default: false

  • --geotools-raster.tileSize

    • Optional parameter to set the tile size stored (default is 256)

    • Default: 256

  • --geotools-vector.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geotools-vector.data

    • A map of date field names to the date format of the file. Use commas to separate each entry, then the first ':' character will separate the field name from the format. Use '\,' to include a comma in the format. For example: "time:MM:dd:YYYY,time2:YYYY/MM/dd hh:mm:ss" configures fields 'time' and 'time2' as dates with different formats

  • --geotools-vector.type

    • Optional parameter that specifies specific type name(s) from the source file

    • Default: []

  • --gpx.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gpx.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gpx.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --tdrive.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --tdrive.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --tdrive.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --twitter.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --twitter.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --twitter.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)
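For example, a directory of GPX files might be staged to HDFS as follows (the local path, HDFS host:port, and base directory are placeholders for illustration):

geowave ingest localToHdfs -f gpx ./gpx_data hdfs-host:8020 /geowave/staging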

geowave ingest localToKafka

NAME

geowave ingest localToKafka - Stage supported files in local file system to a Kafka topic

SYNOPSIS

geowave ingest localToKafka [options] <file or directory>

DESCRIPTION

The geowave ingest localToKafka operator will stage supported files in the local file system to a Kafka topic

OPTIONS

  • --avro.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --avro.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --avro.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -x, --extension

    • individual or comma-delimited set of file extensions to accept (optional)

  • -f, --formats

    • Explicitly set the ingest formats by name (or multiple comma-delimited formats), if not set all available ingest formats will be used

  • --gdelt.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gdelt.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gdelt.extended

    • A flag to indicate whether extended data format should be used

    • Default: false

  • --gdelt.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geolife.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --geolife.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geolife.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geotools-raster.coverage

    • Optional parameter to set the coverage name (default is the file name)

  • --geotools-raster.crs

    • A CRS override for the provided raster file

  • --geotools-raster.histogram

    • Build a histogram of samples per band on ingest for performing band equalization

    • Default: false

  • --geotools-raster.mergeStrategy

    • Optional parameter to choose a tile merge strategy used for mosaic.

    • Default behavior will be none. Alternatively 'no-data' will mosaic the most recent tile over previous tiles, except where there are no data values.

    • Default: none

  • --geotools-raster.nodata

    • Optional parameter to set 'no data' values. If one value is given, it is applied to each band; if multiple are given, then the first totalNoDataValues/totalBands values are applied to the first band and so on, so each band can have multiple differing 'no data' values if needed

    • Default: []

  • --geotools-raster.pyramid

    • Build an image pyramid on ingest for quick reduced resolution query

    • Default: false

  • --geotools-raster.separateBands

    • Optional parameter to separate each band into its own coverage name. By default the coverage name will have '_Bn' appended to it where n is the band’s index.

    • Default: false

  • --geotools-raster.tileSize

    • Optional parameter to set the tile size stored (default is 256)

    • Default: 256

  • --geotools-vector.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geotools-vector.data

    • A map of date field names to the date format of the file. Use commas to separate each entry, then the first ':' character will separate the field name from the format. Use '\,' to include a comma in the format. For example: "time:MM:dd:YYYY,time2:YYYY/MM/dd hh:mm:ss" configures fields 'time' and 'time2' as dates with different formats

  • --geotools-vector.type

    • Optional parameter that specifies specific type name(s) from the source file

    • Default: []

  • --gpx.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gpx.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gpx.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • * --kafkaprops

    • Properties file containing Kafka properties

  • --metadataBrokerList

    • This is for bootstrapping and the producer will only use it for getting metadata (topics, partitions and replicas). The socket connections for sending the actual data will be established based on the broker information returned in the metadata. The format is host1:port1,host2:port2, and the list can be a subset of brokers or a VIP pointing to a subset of brokers.

  • --producerType

    • This parameter specifies whether the messages are sent asynchronously in a background thread. Valid values are (1) async for asynchronous send and (2) sync for synchronous send. By setting the producer to async we allow batching together of requests (which is great for throughput) but open the possibility of a failure of the client machine dropping unsent data.

  • --requestRequiredAcks

    • This value controls when a produce request is considered completed. Specifically, how many other brokers must have committed the data to their log and acknowledged this to the leader?

  • --retryBackoffMs

    • The amount of time to wait before attempting to retry a failed produce request to a given topic partition. This avoids repeated sending-and-failing in a tight loop.

  • --serializerClass

    • The serializer class for messages. The default encoder takes a byte[] and returns the same byte[].

  • --tdrive.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --tdrive.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --tdrive.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --twitter.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --twitter.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --twitter.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)
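For example, a directory of GPX files might be staged to a Kafka topic as follows (the paths are placeholders for illustration):

geowave ingest localToKafka --kafkaprops ./kafka.properties -f gpx ./gpx_data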

geowave ingest localToMrGW

NAME

geowave ingest localToMrGW - Copy supported files from local file system to HDFS and ingest from HDFS

SYNOPSIS

geowave ingest localToMrGW [options] <file or directory> <hdfs host:port> <path to base directory to write to> <store name> <comma delimited index/group list>

DESCRIPTION

The geowave ingest localToMrGW operator will copy supported files from the local file system to HDFS and ingest from HDFS.

OPTIONS

  • --avro.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --avro.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --avro.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -x, --extension

    • individual or comma-delimited set of file extensions to accept (optional)

  • -f, --formats

    • Explicitly set the ingest formats by name (or multiple comma-delimited formats), if not set all available ingest formats will be used

  • --gdelt.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gdelt.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gdelt.extended

    • A flag to indicate whether extended data format should be used

    • Default: false

  • --gdelt.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geolife.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --geolife.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geolife.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geotools-raster.coverage

    • Optional parameter to set the coverage name (default is the file name)

  • --geotools-raster.crs

    • A CRS override for the provided raster file

  • --geotools-raster.histogram

    • Build a histogram of samples per band on ingest for performing band equalization

    • Default: false

  • --geotools-raster.mergeStrategy

    • Optional parameter to choose a tile merge strategy used for mosaic.

    • Default behavior will be none. Alternatively 'no-data' will mosaic the most recent tile over previous tiles, except where there are no data values.

    • Default: none

  • --geotools-raster.nodata

    • Optional parameter to set 'no data' values. If one value is given, it is applied to each band; if multiple are given, then the first totalNoDataValues/totalBands values are applied to the first band and so on, so each band can have multiple differing 'no data' values if needed

    • Default: []

  • --geotools-raster.pyramid

    • Build an image pyramid on ingest for quick reduced resolution query

    • Default: false

  • --geotools-raster.separateBands

    • Optional parameter to separate each band into its own coverage name. By default the coverage name will have '_Bn' appended to it where n is the band’s index.

    • Default: false

  • --geotools-raster.tileSize

    • Optional parameter to set the tile size stored (default is 256)

    • Default: 256

  • --geotools-vector.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geotools-vector.data

    • A map of date field names to the date format of the file. Use commas to separate each entry, then the first ':' character will separate the field name from the format. Use '\,' to include a comma in the format. For example: "time:MM:dd:YYYY,time2:YYYY/MM/dd hh:mm:ss" configures fields 'time' and 'time2' as dates with different formats

  • --geotools-vector.type

    • Optional parameter that specifies specific type name(s) from the source file

    • Default: []

  • --gpx.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gpx.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gpx.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --jobtracker

    • Hadoop job tracker hostname and port in the format hostname:port

  • --resourceman

    • Yarn resource manager hostname and port in the format hostname:port

  • --tdrive.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --tdrive.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --tdrive.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --twitter.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --twitter.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --twitter.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -v, --visibility

    • The visibility of the data ingested (optional; default is 'public')
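For example (the paths, HDFS host:port, store name, and index name are placeholders for illustration):

geowave ingest localToMrGW -f gpx ./gpx_data hdfs-host:8020 /geowave/staging my_store spatial_idx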

geowave ingest mrToGW

NAME

geowave ingest mrToGW - Ingest supported files that already exist in HDFS

SYNOPSIS

geowave ingest mrToGW [options] <hdfs host:port> <path to base directory to write to> <store name> <comma delimited index/group list>

DESCRIPTION

The geowave ingest mrToGW operator will ingest supported files that already exist in HDFS

OPTIONS

  • --avro.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --avro.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --avro.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -x, --extension

    • individual or comma-delimited set of file extensions to accept (optional)

  • -f, --formats

    • Explicitly set the ingest formats by name (or multiple comma-delimited formats), if not set all available ingest formats will be used

  • --gdelt.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gdelt.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gdelt.extended

    • A flag to indicate whether extended data format should be used

    • Default: false

  • --gdelt.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geolife.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --geolife.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geolife.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --geotools-raster.coverage

    • Optional parameter to set the coverage name (default is the file name)

  • --geotools-raster.crs

    • A CRS override for the provided raster file

  • --geotools-raster.histogram

    • Build a histogram of samples per band on ingest for performing band equalization

    • Default: false

  • --geotools-raster.mergeStrategy

    • Optional parameter to choose a tile merge strategy used for mosaic.

    • Default behavior will be none. Alternatively 'no-data' will mosaic the most recent tile over previous tiles, except where there are no data values.

    • Default: none

  • --geotools-raster.nodata

    • Optional parameter to set 'no data' values. If one value is given, it is applied to each band; if multiple are given, then the first totalNoDataValues/totalBands values are applied to the first band and so on, so each band can have multiple differing 'no data' values if needed

    • Default: []

  • --geotools-raster.pyramid

    • Build an image pyramid on ingest for quick reduced resolution query

    • Default: false

  • --geotools-raster.separateBands

    • Optional parameter to separate each band into its own coverage name. By default the coverage name will have '_Bn' appended to it where n is the band’s index.

    • Default: false

  • --geotools-raster.tileSize

    • Optional parameter to set the tile size stored (default is 256)

    • Default: 256

  • --geotools-vector.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --geotools-vector.data

    • A map of date field names to the date format of the file. Use commas to separate each entry, then the first ':' character will separate the field name from the format. Use '\,' to include a comma in the format. For example: "time:MM:dd:YYYY,time2:YYYY/MM/dd hh:mm:ss" configures fields 'time' and 'time2' as dates with different formats

  • --geotools-vector.type

    • Optional parameter that specifies specific type name(s) from the source file

    • Default: []

  • --gpx.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --gpx.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --gpx.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --jobtracker

    • Hadoop job tracker hostname and port in the format hostname:port

  • --resourceman

    • Yarn resource manager hostname and port in the format hostname:port

  • --tdrive.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --tdrive.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --tdrive.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • --twitter.avro

    • A flag to indicate whether avro feature serialization should be used

    • Default: false

  • --twitter.cql

    • A CQL filter; only data matching this filter will be ingested

    • Default: <empty string>

  • --twitter.typename

    • A comma-delimited set of typenames to ingest; feature types matching the specified typenames will be ingested (optional; by default all types will be ingested)

  • -v, --visibility

    • The visibility of the data ingested (optional; default is 'public')
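For example (the HDFS host:port, base directory, store name, and index name are placeholders for illustration):

geowave ingest mrToGW -f gpx hdfs-host:8020 /geowave/staging my_store spatial_idx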

geowave ingest sparkToGW

NAME

geowave ingest sparkToGW - Ingest supported files that already exist in HDFS or S3

SYNOPSIS

geowave ingest sparkToGW [options] <input directory> <store name> <comma delimited index/group list>

DESCRIPTION

Ingest supported files that already exist in HDFS or S3

OPTIONS

  • -x, --extension

    • individual or comma-delimited set of file extensions to accept (optional)

  • -f, --formats

    • Explicitly set the ingest formats by name (or multiple comma-delimited formats), if not set all available ingest formats will be used

  • -ho, --hosts

    • The Spark driver host

    • Default: localhost

  • -m, --master

    • The Spark master designation

    • Default: local

  • -n, --name

    • The Spark application name

    • Default: Spark Ingest

  • -c, --numcores

    • The number of cores

    • Default: -1

  • -e, --numexecutors

    • The number of executors

    • Default: -1

  • -v, --visibility

    • The visibility of the data ingested (optional; default is 'public')
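For example (the input directory, store name, and index name are placeholders for illustration):

geowave ingest sparkToGW -m local -f gpx hdfs://hdfs-host:8020/geowave/staging my_store spatial_idx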


Remote Commands

Operations to manage a remote store (Required options are designated with an *)

geowave remote calcstat

NAME

geowave remote calcstat - Calculate a specific statistic in the remote store, given adapter ID and statistic ID

SYNOPSIS

geowave remote calcstat [options] <store name> <adapterId> <statId>

DESCRIPTION

The geowave remote calcstat operator will calculate a specific statistic in the remote store, given an adapter ID and a statistic ID.

OPTIONS

  • --auth

    • The authorizations used for the statistics calculation as a subset of the Accumulo user authorizations; by default all authorizations are used.

  • --json

    • Output in JSON format.

    • Default: false
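For example (gpxpoint and COUNT_DATA stand in for an adapter ID and a statistic ID from your own dataset):

geowave remote calcstat --json my_store gpxpoint COUNT_DATA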

geowave remote clear

NAME

geowave remote clear - Clear ALL data from a GeoWave store and delete tables

SYNOPSIS

geowave remote clear [options] <store name>

DESCRIPTION

The geowave remote clear operator will clear ALL data from a GeoWave store and delete tables

OPTIONS

There are currently no options for this command

geowave remote copy

NAME

geowave remote copy - Copy a data store

SYNOPSIS

geowave remote copy [options] <input store name> <output store name>

DESCRIPTION

This command will allow a user to copy a data store

OPTIONS

  • * --hdfsHostPort

    • The HDFS host and port

  • * --jobSubmissionHostPort

    • The job submission tracker

  • --maxSplits

    • The max partitions for the input data

  • --minSplits

    • The min partitions for the input data

  • --numReducers

    • Number of threads writing at a time (default: 8)

    • Default: 8
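For example (the host:port values and store names are placeholders for illustration):

geowave remote copy --hdfsHostPort localhost:8020 --jobSubmissionHostPort localhost:8032 --numReducers 8 my_store my_store_copy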

geowave remote listadapter

NAME

geowave remote listadapter - Display all adapters in this remote store

SYNOPSIS

geowave remote listadapter [options] <store name>

DESCRIPTION

The geowave remote listadapter operator will display all adapters in this remote store.

OPTIONS

There are currently no options for this command

geowave remote listindex

NAME

geowave remote listindex - Display all indices in this remote store

SYNOPSIS

geowave remote listindex [options] <store name>

DESCRIPTION

The geowave remote listindex operator will display all indices in a specific remote store

OPTIONS

There are currently no options for this command

geowave remote liststats

NAME

geowave remote liststats - Print statistics of an existing GeoWave dataset to standard output

SYNOPSIS

geowave remote liststats [options] <store name> [<adapter name>]

DESCRIPTION

The geowave remote liststats operator will print statistics of an existing GeoWave dataset to standard output

OPTIONS

  • --auth

    • The authorizations used for the statistics calculation as a subset of the Accumulo user authorizations; by default all authorizations are used.

  • --json

    • Output in JSON format.

    • Default: false
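For example, to print the statistics of a single adapter as JSON (the store and adapter names are placeholders for illustration):

geowave remote liststats --json my_store gpxpoint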

geowave remote mergedata

NAME

geowave remote mergedata - Merge all rows for a given adapter and index

SYNOPSIS

geowave remote mergedata [options] <storename> <indexname>

DESCRIPTION

This command will allow a user to merge all rows for a given adapter and index

OPTIONS

There are currently no options for this command

geowave remote recalcstats

NAME

geowave remote recalcstats - Calculate the statistics of an existing GeoWave dataset

SYNOPSIS

geowave remote recalcstats [options] <store name> [<adapter name>]

DESCRIPTION

The geowave remote recalcstats operator will calculate the statistics of an existing GeoWave dataset

OPTIONS

  • --auth

    • The authorizations used for the statistics calculation as a subset of the Accumulo user authorizations; by default all authorizations are used.

  • --json

    • Output in JSON format.

    • Default: false

geowave remote rmstat

NAME

geowave remote rmstat - Remove a statistic from the remote store

SYNOPSIS

geowave remote rmstat [options] <store name> <adapterId> <statId>

DESCRIPTION

The geowave remote rmstat operator will remove a statistic from the remote store. You will be prompted with "Are you sure?"

OPTIONS

  • --auth

    • The authorizations used for the statistics calculation as a subset of the Accumulo user authorizations; by default all authorizations are used.

  • --json

    • Output in JSON format.

    • Default: false

geowave remote version

NAME

geowave remote version - Get the version of GeoWave running on the instance of a remote datastore

SYNOPSIS

geowave remote version [options] <storename>

DESCRIPTION

This command will return the version of GeoWave running on the instance of a remote datastore.

OPTIONS

There are currently no options for this command


Analytic Commands

Commands that run mapreduce or spark processing to enhance an existing GeoWave dataset (Required options are designated with an *)

The commands below can also be run as yarn or hadoop API commands (i.e., MapReduce).

For instance, if running the analytic using yarn:

yarn jar geowave-tools.jar analytic <algorithm> <options> <store>

geowave analytic dbscan

NAME

geowave analytic dbscan - Density Based Scanner

SYNOPSIS

geowave analytic dbscan [options] <storename>

DESCRIPTION

The geowave analytic dbscan operator will run a density based scanner analytic on GeoWave data

EXAMPLE

yarn jar geowave-tools.jar analytic dbscan -cmi 5 -cms 10 -emn 2 -emx 6 -pmd 1000 -orc 4 -hdfs localhost:53000 -jobtracker localhost:8032 -hdfsbase /user/rwgdrummer --query.adapters gpxpoint my_store

Run through a maximum of 5 iterations (cmi), with the max distance between points as 10 meters (cms), a min hdfs input split of 2 (emn), a max hdfs input split of 6 (emx), a max search distance of 1000 meters (pmd), and a reducer count of 4 (orc). The hdfs ipc port is localhost:53000 (hdfs), the yarn job tracker is at localhost:8032 (jobtracker), the temporary files needed by this job are stored in hdfs://host:port/user/rwgdrummer (hdfsbase), and the data executed against DBSCAN is 'gpxpoint' (query.adapters). The accumulo connection parameters are loaded from my_store.

EXECUTION

DBSCAN uses GeoWaveInputFormat to load data from GeoWave into HDFS. You can use the extract query parameter to limit the records used in the analytic.

It iteratively calls Nearest Neighbor to execute a sequence of concave hulls. The hulls are saved into sequence files written to a temporary HDFS directory, and then read in again for the next DBSCAN iteration.

After completion, the data is written back from HDFS to Accumulo using a job called the "input load runner".

OPTIONS

  • * -cmi, --clusteringMaxIterations

    • Maximum number of iterations when finding optimal clusters

  • * -cms, --clusteringMinimumSize

    • Minimum Cluster Size

  • -cdf, --commonDistanceFunctionClass

    • Distance function class that implements mil.nga.giat.geowave.analytics.distance.DistanceFn

  • * -emx, --extractMaxInputSplit

    • Maximum hdfs input split size

  • * -emn, --extractMinInputSplit

    • Minimum hdfs input split size

  • -eq, --extractQuery

    • Query

  • -b, --globalBatchId

    • Batch ID

  • -hdt, --hullDataTypeId

    • Data Type ID for a centroid item

  • -hpe, --hullProjectionClass

    • Class to project onto 2D space. Implements mil.nga.giat.geowave.analytics.tools.Projection

  • -ifc, --inputFormatClass

    • Input Format Class

  • -conf, --mapReduceConfigFile

    • MapReduce Configuration

  • * -hdfsbase, --mapReduceHdfsBaseDir

    • Fully qualified path to the base directory in hdfs

  • * -hdfs, --mapReduceHdfsHostPort

    • HDFS hostname and port in the format hostname:port

  • -jobtracker, --mapReduceJobtrackerHostPort

    • [REQUIRED (or resourceman)] Hadoop job tracker hostname and port in the format hostname:port

  • -resourceman, --mapReduceYarnResourceManager

    • [REQUIRED (or jobtracker)] Yarn resource manager hostname and port in the format hostname:port

  • -ons, --outputDataNamespaceUri

    • Output namespace for objects that will be written to GeoWave

  • -odt, --outputDataTypeId

    • Output Data ID assigned to objects that will be written to GeoWave

  • -oop, --outputHdfsOutputPath

    • Output HDFS File Path

  • -oid, --outputIndexId

    • Output Index ID for objects that will be written to GeoWave

  • -ofc, --outputOutputFormat

    • Output Format Class

  • -orc, --outputReducerCount

    • Number of Reducers For Output

  • -pdt, --partitionDistanceThresholds

    • Comma separated list of distance thresholds, per dimension

  • -pdu, --partitionGeometricDistanceUnit

    • Geometric distance unit (m=meters, km=kilometers; see symbols for javax.units.BaseUnit)

  • * -pmd, --partitionMaxDistance

    • Maximum Partition Distance

  • -pms, --partitionMaxMemberSelection

    • Maximum number of members selected from a partition

  • -pdr, --partitionPartitionDecreaseRate

    • Rate of decrease for precision (within (0,1])

  • -pp, --partitionPartitionPrecision

    • Partition Precision

  • -pc, --partitionPartitionerClass

    • Perform primary partitioning with the provided class

  • -psp, --partitionSecondaryPartitionerClass

    • Perform secondary partitioning with the provided class

  • * --query.adapters

    • The comma-separated list of data adapters to query; by default all adapters are used.

  • --query.auth

    • The comma-separated list of authorizations used during extract; by default all authorizations are used.

  • --query.index

    • The specific index to query; by default one is chosen for each adapter.

geowave analytic kde

NAME

geowave analytic kde - Kernel Density Estimate

SYNOPSIS

geowave analytic kde [options] <input storename> <output storename>

DESCRIPTION

The geowave analytic kde operator will run a Kernel Density Estimate analytic on GeoWave data

OPTIONS

  • * --coverageName

    • The coverage name

  • --cqlFilter

    • An optional CQL filter applied to the input data

  • * --featureType

    • The name of the feature type to run a KDE on

  • * --hdfsHostPort

    • The hdfs host port

  • --indexId

    • An optional index ID to filter the input data

  • * --jobSubmissionHostPort

    • The job submission tracker

  • * --maxLevel

    • The max level to run a KDE at

  • --maxSplits

    • The max partitions for the input data

  • * --minLevel

    • The min level to run a KDE at

  • --minSplits

    • The min partitions for the input data

  • --tileSize

    • The tile size

    • Default: 1

geowave analytic kmeansjump

NAME

geowave analytic kmeansjump - KMeans Clustering using Jump Method

SYNOPSIS

geowave analytic kmeansjump [options] <storename>

DESCRIPTION

The geowave analytic kmeansjump operator will execute a KMeans Clustering analytic using a Jump Method

EXAMPLE

yarn jar geowave-tools.jar analytic kmeansjump -cmi 15 -zl 1 -emx 4000 -emn 100 -hdfsbase /usr/rwgdrummer/temp_dir_kmeans -hdfs localhost:53000 -jobtracker localhost:8032 --query.adapters hail -jkp 3 -jrc 4,8 my_store

The maximum number of clustering iterations is 15 (cmi), the zoom level is 1 (zl), the max hdfs input split is 4000 (emx), the min hdfs input split is 100 (emn), the temporary files needed by this job are stored in hdfs://host:port/user/rwgdrummer/temp_dir_kmeans (hdfsbase), the hdfs ipc port is localhost:53000 (hdfs), the yarn job tracker is at localhost:8032 (jobtracker), the data queried is 'hail' (query.adapters), the minimum k for kmeans parallel sampling is 3 (jkp), and the comma-separated range of centroids is 4,8 (jrc). The Accumulo connection parameters are loaded from my_store.

EXECUTION

KMeansJump uses most of the same parameters as KMeansParallel. It tries every k value in the given range (-jrc) to find the value with the least entropy. The other option, jkp, specifies which k values use kmeans parallel for sampling versus a single sampler (which uses a random sample). For instance, if you specify 4,8 for jrc and 6 for jkp, then k=4,5 will use the kmeans parallel sampler, while k=6,7,8 will use the single sampler.

KMeansJump runs several iterations, invoking the sampler (described above, which in turn calls the normal k-means algorithm to determine centroids) and then running a KMeans distortion job, which calculates the entropy of the computed centroids.

See the "EXECUTION" documentation for the kmeansparallel operation for a discussion of the output, tolerance, and performance variables.

OPTIONS

  • -cce, --centroidExtractorClass

    • Centroid Extractor Class implements mil.nga.giat.geowave.analytics.extract.CentroidExtractor

  • -cid, --centroidIndexId

    • Index Identifier for Centroids

  • -cfc, --centroidWrapperFactoryClass

    • A factory class that implements mil.nga.giat.geowave.analytics.tools.AnalyticItemWrapperFactory

  • -czl, --centroidZoomLevel

    • Zoom Level Number

  • -cct, --clusteringConverganceTolerance

    • Convergence Tolerance

  • * -cmi, --clusteringMaxIterations

    • Maximum number of iterations when finding optimal clusters

  • -crc, --clusteringMaxReducerCount

    • Maximum Clustering Reducer Count

  • * -zl, --clusteringZoomLevels

    • Number of Zoom Levels to Process

  • -dde, --commonDimensionExtractClass

    • Dimension Extractor Class implements mil.nga.giat.geowave.analytics.extract.DimensionExtractor

  • -cdf, --commonDistanceFunctionClass

    • Distance Function Class implements mil.nga.giat.geowave.analytics.distance.DistanceFn

  • -ens, --extractDataNamespaceUri

    • Output Data Namespace URI

  • -ede, --extractDimensionExtractClass

    • Class to extract dimensions into a simple feature output

  • * -emx, --extractMaxInputSplit

    • Maximum hdfs input split size

  • * -emn, --extractMinInputSplit

    • Minimum hdfs input split size

  • -eot, --extractOutputDataTypeId

    • Output Data Type ID

  • -eq, --extractQuery

    • Query

  • -erc, --extractReducerCount

    • Number of Reducers For initial data extraction and de-duplication

  • -b, --globalBatchId

    • Batch ID

  • -pb, --globalParentBatchId

    • Parent Batch ID

  • -hns, --hullDataNamespaceUri

    • Data Type Namespace for a centroid item

  • -hdt, --hullDataTypeId

    • Data Type ID for a centroid item

  • -hid, --hullIndexId

    • Index Identifier for Centroids

  • -hpe, --hullProjectionClass

    • Class to project on to 2D space. Implements mil.nga.giat.geowave.analytics.tools.Projection

  • -hrc, --hullReducerCount

    • Centroid Reducer Count

  • -hfc, --hullWrapperFactoryClass

    • Class to create analytic item to capture hulls. Implements mil.nga.giat.geowave.analytics.tools.AnalyticItemWrapperFactory

  • -ifc, --inputFormatClass

    • Input Format Class

  • * -jkp, --jumpKplusplusMin

    • The minimum k at which k-means++ takes over sampling.

  • * -jrc, --jumpRangeOfCentroids

    • Comma-separated range of centroids (e.g. 2,100)

  • -conf, --mapReduceConfigFile

    • MapReduce Configuration

  • * -hdfsbase, --mapReduceHdfsBaseDir

    • Fully qualified path to the base directory in hdfs

  • * -hdfs, --mapReduceHdfsHostPort

    • HDFS hostname and port in the format hostname:port

  • -jobtracker, --mapReduceJobtrackerHostPort

    • [REQUIRED (or resourceman)] Hadoop job tracker hostname and port in the format hostname:port

  • -resourceman, --mapReduceYarnResourceManager

    • [REQUIRED (or jobtracker)] Yarn resource manager hostname and port in the format hostname:port

  • -ofc, --outputOutputFormat

    • Output Format Class

  • -orc, --outputReducerCount

    • Number of Reducers For Output

  • * --query.adapters

    • The comma-separated list of data adapters to query; by default all adapters are used.

  • --query.auth

    • The comma-separated list of authorizations used during extract; by default all authorizations are used.

  • --query.index

    • The specific index to query; by default one is chosen for each adapter.

geowave analytic kmeansparallel

NAME

geowave analytic kmeansparallel - KMeans Parallel Clustering

SYNOPSIS

geowave analytic kmeansparallel [options] <storename>

DESCRIPTION

The geowave analytic kmeansparallel operator will execute a KMeans Parallel Clustering analytic

EXAMPLE

yarn jar geowave-tools.jar analytic kmeansparallel -cmi 15 -zl 1 -emx 4000 -emn 100 -hdfsbase /usr/rwgdrummer/temp_dir_kmeans -hdfs localhost:53000 -jobtracker localhost:8032 --query.adapters hail -sms 4 -sxs 8 -ssi 10 my_store

The maximum number of clustering iterations is 15 (cmi), the zoom level is 1 (zl), the max hdfs input split is 4000 (emx), the min hdfs input split is 100 (emn), the temporary files needed by this job are stored in hdfs://host:port/user/rwgdrummer/temp_dir_kmeans (hdfsbase), the hdfs ipc port is localhost:53000 (hdfs), the yarn job tracker is at localhost:8032 (jobtracker), the data queried is 'hail' (query.adapters), the min sample size is 4 (sms, which is kmin), the max sample size is 8 (sxs, which is kmax), and the minimum number of sampling iterations is 10 (ssi). The Accumulo connection parameters are loaded from my_store.

EXECUTION

KMeansParallel tries to identify the optimal k (between sms and sxs) for a set of zoom levels (1 → zl). When the zoom level is 1, it will perform a normal kmeans and find k clusters. If the zoom level is 2 or higher, it will take each cluster found and try to create sub-clusters (bounded by that cluster), identifying a new optimal k for each sub-cluster. As such, without powerful infrastructure, this approach could take a significant amount of time to complete with zoom levels higher than 1.

KMeansParallel executes by first running an extraction and de-duplication on data received via GeoWaveInputFormat. The data is copied to HDFS for faster processing. The K-Sampler job is used to pick sample centroid points. These centroids are then assigned a cost, and weak centroids are stripped before the K-Sampler is executed again. This process iterates several times before the best centroid locations are found, which are then fed into the real K-Means algorithm as initial guesses. K-Means iterates until the tolerance is reached (-cct, which defaults to 0.0001) or the max iterations are met (-cmi).

After execution, KMeansParallel writes the centroids to an output data type (-eot, defaults to centroid), and then creates an informational set of convex hulls which you can plot in GeoServer to visually identify cluster groups (-hdt, defaults to convex_hull).

For tuning performance, you can set the number of reducers used in each step: the extraction/dedupe reducer count is -erc, the clustering reducer count is -crc, the convex hull reducer count is -hrc, and the output reducer count is -orc.

If you would like to run the algorithm multiple times, it may be useful to set the batch id (-b), which can be used to distinguish between multiple batches (runs).
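
For example, a second run of the example above could be tagged with a distinct batch ID so its results can be distinguished from the first (the value 'run2' here is arbitrary):

yarn jar geowave-tools.jar analytic kmeansparallel -cmi 15 -zl 1 -emx 4000 -emn 100 -hdfsbase /usr/rwgdrummer/temp_dir_kmeans -hdfs localhost:53000 -jobtracker localhost:8032 --query.adapters hail -sms 4 -sxs 8 -ssi 10 -b run2 my_store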

OPTIONS

  • -cce, --centroidExtractorClass

    • Centroid Extractor Class implements mil.nga.giat.geowave.analytics.extract.CentroidExtractor

  • -cid, --centroidIndexId

    • Index Identifier for Centroids

  • -cfc, --centroidWrapperFactoryClass

    • A factory class that implements mil.nga.giat.geowave.analytics.tools.AnalyticItemWrapperFactory

  • -czl, --centroidZoomLevel

    • Zoom Level Number

  • -cct, --clusteringConverganceTolerance

    • Convergence Tolerance

  • * -cmi, --clusteringMaxIterations

    • Maximum number of iterations when finding optimal clusters

  • -crc, --clusteringMaxReducerCount

    • Maximum Clustering Reducer Count

  • * -zl, --clusteringZoomLevels

    • Number of Zoom Levels to Process

  • -dde, --commonDimensionExtractClass

    • Dimension Extractor Class implements mil.nga.giat.geowave.analytics.extract.DimensionExtractor

  • -cdf, --commonDistanceFunctionClass

    • Distance Function Class implements mil.nga.giat.geowave.analytics.distance.DistanceFn

  • -ens, --extractDataNamespaceUri

    • Output Data Namespace URI

  • -ede, --extractDimensionExtractClass

    • Class to extract dimensions into a simple feature output

  • * -emx, --extractMaxInputSplit

    • Maximum hdfs input split size

  • * -emn, --extractMinInputSplit

    • Minimum hdfs input split size

  • -eot, --extractOutputDataTypeId

    • Output Data Type ID

  • -eq, --extractQuery

    • Query

  • -erc, --extractReducerCount

    • Number of Reducers For initial data extraction and de-duplication

  • -b, --globalBatchId

    • Batch ID

  • -pb, --globalParentBatchId

    • Parent Batch ID

  • -hns, --hullDataNamespaceUri

    • Data Type Namespace for a centroid item

  • -hdt, --hullDataTypeId

    • Data Type ID for a centroid item

  • -hid, --hullIndexId

    • Index Identifier for Centroids

  • -hpe, --hullProjectionClass

    • Class to project on to 2D space. Implements mil.nga.giat.geowave.analytics.tools.Projection

  • -hrc, --hullReducerCount

    • Centroid Reducer Count

  • -hfc, --hullWrapperFactoryClass

    • Class to create analytic item to capture hulls. Implements mil.nga.giat.geowave.analytics.tools.AnalyticItemWrapperFactory

  • -ifc, --inputFormatClass

    • Input Format Class

  • -conf, --mapReduceConfigFile

    • MapReduce Configuration

  • * -hdfsbase, --mapReduceHdfsBaseDir

    • Fully qualified path to the base directory in hdfs

  • * -hdfs, --mapReduceHdfsHostPort

    • HDFS hostname and port in the format hostname:port

  • -jobtracker, --mapReduceJobtrackerHostPort

    • [REQUIRED (or resourceman)] Hadoop job tracker hostname and port in the format hostname:port

  • -resourceman, --mapReduceYarnResourceManager

    • [REQUIRED (or jobtracker)] Yarn resource manager hostname and port in the format hostname:port

  • -ofc, --outputOutputFormat

    • Output Format Class

  • -orc, --outputReducerCount

    • Number of Reducers For Output

  • * --query.adapters

    • The comma-separated list of data adapters to query; by default all adapters are used.

  • --query.auth

    • The comma-separated list of authorizations used during extract; by default all authorizations are used.

  • --query.index

    • The specific index to query; by default one is chosen for each adapter.

  • * -sxs, --sampleMaxSampleSize

    • Max Sample Size

  • * -sms, --sampleMinSampleSize

    • Minimum Sample Size

  • * -ssi, --sampleSampleIterations

    • Minimum number of sample iterations

geowave analytic nn

NAME

geowave analytic nn - Nearest Neighbors

SYNOPSIS

geowave analytic nn [options] <storename>

DESCRIPTION

The geowave analytic nn operator will execute a Nearest Neighbors analytic. The 'nn' analytic is similar to DBScan, but with fewer arguments; it simply dumps all near neighbors for every feature ID into a list of pairs. Most developers will want to extend the framework to add their own extensions.

EXAMPLE

yarn jar geowave-tools.jar analytic nn -emn 2 -emx 6 -pmd 1000 -oop /user/rwgdrummer_out -orc 4 -hdfs localhost:53000 -jobtracker localhost:8032 -hdfsbase /user/rwgdrummer --query.adapters gpxpoint my_store

The min hdfs input split is 2 (emn), the max hdfs input split is 6 (emx), the max search distance is 1000 meters (pmd), the sequence file output directory is hdfs://host:port/user/rwgdrummer_out (oop), the reducer count is 4 (orc), the hdfs ipc port is localhost:53000 (hdfs), the yarn job tracker is at localhost:8032 (jobtracker), the temporary files needed by this job are stored in hdfs://host:port/user/rwgdrummer (hdfsbase), and the data queried is 'gpxpoint' (query.adapters). The Accumulo connection parameters are loaded from my_store.

EXECUTION

To execute a nearest neighbor search, GeoWave uses a "partitioner" to partition all data along the Hilbert curve into square segments for the purpose of parallelizing the search.

The default partitioner multiplies the maximum partition distance (pmd) by 2 and uses that for the actual partition size. The terminology is therefore a bit confusing, but 'pmd' is the most important variable here: it describes the maximum distance at which one point is considered a neighbor of another.
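
For example, with -pmd 1000 the default partitioner creates partitions roughly 2000 meters on a side, while 1000 meters remains the maximum distance at which two points are treated as neighbors.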

OPTIONS

  • -cdf, --commonDistanceFunctionClass

    • Distance Function Class implements mil.nga.giat.geowave.analytics.distance.DistanceFn

  • * -emx, --extractMaxInputSplit

    • Maximum hdfs input split size

  • * -emn, --extractMinInputSplit

    • Minimum hdfs input split size

  • -eq, --extractQuery

    • Query

  • -ifc, --inputFormatClass

    • Input Format Class

  • -conf, --mapReduceConfigFile

    • MapReduce Configuration

  • * -hdfsbase, --mapReduceHdfsBaseDir

    • Fully qualified path to the base directory in hdfs

  • * -hdfs, --mapReduceHdfsHostPort

    • HDFS hostname and port in the format hostname:port

  • -jobtracker, --mapReduceJobtrackerHostPort

    • [REQUIRED (or resourceman)] Hadoop job tracker hostname and port in the format hostname:port

  • -resourceman, --mapReduceYarnResourceManager

    • [REQUIRED (or jobtracker)] Yarn resource manager hostname and port in the format hostname:port

  • * -oop, --outputHdfsOutputPath

    • Output HDFS File Path

  • -ofc, --outputOutputFormat

    • Output Format Class

  • -orc, --outputReducerCount

    • Number of Reducers For Output

  • -pdt, --partitionDistanceThresholds

    • Comma separated list of distance thresholds, per dimension

  • -pdu, --partitionGeometricDistanceUnit

    • Geometric distance unit (m=meters, km=kilometers; see symbols for javax.units.BaseUnit)

  • * -pmd, --partitionMaxDistance

    • Maximum Partition Distance

  • -pms, --partitionMaxMemberSelection

    • Maximum number of members selected from a partition

  • -pp, --partitionPartitionPrecision

    • Partition Precision

  • -pc, --partitionPartitionerClass

    • Perform primary partitioning with the provided class

  • -psp, --partitionSecondaryPartitionerClass

    • Perform secondary partitioning with the provided class

  • * --query.adapters

    • The comma-separated list of data adapters to query; by default all adapters are used.

  • --query.auth

    • The comma-separated list of authorizations used during extract; by default all authorizations are used.

  • --query.index

    • The specific index to query; by default one is chosen for each adapter.

geowave analytic sql

NAME

geowave analytic sql - SparkSQL queries

SYNOPSIS

geowave analytic sql [options] <sql query> - e.g. 'select * from storename[|adaptername] where condition...'

DESCRIPTION

The geowave analytic sql operator will execute a SparkSQL query
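
EXAMPLE

A minimal sketch, assuming a configured store named my_store containing a feature type (adapter) named gpxpoint (both names hypothetical), displaying up to 25 result rows:

geowave analytic sql -s 25 "select * from my_store|gpxpoint"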

OPTIONS

  • --csv

    • The output CSV file name

  • --out

    • The output datastore name

  • --outtype

    • The output feature type (adapter) name

  • -s, --show

    • Number of result rows to display

    • Default: 20


Accumulo Commands

Utility operations to set accumulo splits and run a test server (Required options are designated with an *)

geowave accumulo presplitpartitionid

NAME

geowave accumulo presplitpartitionid - Pre-split Accumulo table by providing the number of partition IDs

SYNOPSIS

geowave accumulo presplitpartitionid [options] <storename>

DESCRIPTION

This command will pre-split an accumulo table by providing the number of partition IDs
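
EXAMPLE

For example, to pre-split a (hypothetical) store named my_store into 32 partition IDs across all indices:

geowave accumulo presplitpartitionid --num 32 my_store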

OPTIONS

  • --indexId

    • The geowave index ID (optional; default is all indices)

  • --num

    • The number of partitions (or entries)

    • Default: 0

geowave accumulo runserver

NAME

geowave accumulo runserver - Runs a standalone mini Accumulo server for test and debug with GeoWave

SYNOPSIS

geowave accumulo runserver [options]

DESCRIPTION

This command will run a standalone mini single-node accumulo server, which can be used locally for testing and debugging GeoWave, without needing to stand up an entire cluster.

OPTIONS

There are currently no options available for this command

geowave accumulo splitequalinterval

NAME

geowave accumulo splitequalinterval - Set Accumulo splits by providing the number of partitions based on an equal interval strategy

SYNOPSIS

geowave accumulo splitequalinterval [options] <storename>

DESCRIPTION

This command will allow a user to set the Accumulo datastore splits by providing the number of partitions based on an equal interval strategy.

OPTIONS

  • --indexId

    • The geowave index ID (optional; default is all indices)

  • --num

    • The number of partitions (or entries)

    • Default: 0

geowave accumulo splitnumrecords

NAME

geowave accumulo splitnumrecords - Set Accumulo splits by providing the number of entries per split

SYNOPSIS

geowave accumulo splitnumrecords [options] <storename>

DESCRIPTION

This command will set the accumulo datastore splits by providing the number of entries per split.
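
EXAMPLE

For example, to split a (hypothetical) store named my_store so that each split holds roughly one million entries:

geowave accumulo splitnumrecords --num 1000000 my_store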

OPTIONS

  • --indexId

    • The geowave index ID (optional; default is all indices)

  • --num

    • The number of partitions (or entries)

    • Default: 0

geowave accumulo splitquantile

NAME

geowave accumulo splitquantile - Set Accumulo splits by providing the number of partitions based on a quantile distribution strategy

SYNOPSIS

geowave accumulo splitquantile [options] <storename>

DESCRIPTION

This command will allow a user to set the accumulo datastore splits by providing the number of partitions based on a quantile distribution strategy.

OPTIONS

  • --indexId

    • The geowave index ID (optional; default is all indices)

  • --num

    • The number of partitions (or entries)

    • Default: 0


GeoServer Commands

Commands that manage geoserver data stores and layers (Required options are designated with an *)

geowave gs addcs

NAME

geowave gs addcs - Add a GeoServer coverage store

SYNOPSIS

geowave gs addcs [options] <GeoWave store name>

DESCRIPTION

This command will add a GeoServer coverage store

OPTIONS

  • -cs, --coverageStore

    • <coverage store name>

  • -histo, --equalizeHistogramOverride

    • This parameter overrides the default behavior of always performing histogram equalization when a histogram exists.

    • Valid values are true and false.

  • -interp, --interpolationOverride

    • This will override the default interpolation stored for each layer.

    • Valid values are 0, 1, 2, and 3 for NearestNeighbor, Bilinear, Bicubic, and Bicubic (polynomial variant), respectively.

  • -scale, --scaleTo8Bit

    • By default, integer values will automatically be scaled to 8-bit and floating point values will not. This can be overridden by setting this value to true or false.

  • -ws, --workspace

    • <workspace name>

geowave gs addcv

NAME

geowave gs addcv - Add a GeoServer coverage

SYNOPSIS

geowave gs addcv [options] <coverage name>

DESCRIPTION

This command will add a GeoServer coverage

OPTIONS

  • * -cs, --cvgstore

    • <coverage store name>

  • -ws, --workspace

    • <workspace name>

geowave gs addds

NAME

geowave gs addds - Add a GeoServer datastore

SYNOPSIS

geowave gs addds [options] <GeoWave store name>

DESCRIPTION

This command will add a GeoServer datastore

OPTIONS

  • -ds, --datastore

    • <datastore name>

  • -ws, --workspace

    • <workspace name>

geowave gs addfl

NAME

geowave gs addfl - Add a GeoServer feature layer

SYNOPSIS

geowave gs addfl [options] <layer name>

DESCRIPTION

This command will add a GeoServer feature layer
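
EXAMPLE

A sketch, assuming a GeoServer datastore named my_datastore has already been added (the layer and datastore names are hypothetical):

geowave gs addfl -ds my_datastore mylayer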

OPTIONS

  • * -ds, --datastore

    • <datastore name>

  • -ws, --workspace

    • <workspace name>

geowave gs addlayer

NAME

geowave gs addlayer - Add a GeoServer layer from the given GeoWave store

SYNOPSIS

geowave gs addlayer [options] <GeoWave store name>

DESCRIPTION

This command will add a GeoServer layer from the given GeoWave store
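
EXAMPLE

For example, to add all vector layers from a (hypothetical) GeoWave store my_store into a workspace named geowave:

geowave gs addlayer -a VECTOR -ws geowave my_store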

OPTIONS

  • -id, --adapterId

    • select just <adapter id> from the store

  • -a, --add

    • For multiple layers, add (all | raster | vector)

    • Possible Values: [ALL, RASTER, VECTOR]

  • -sld, --setStyle

    • <default style sld>

  • -ws, --workspace

    • <workspace name>

geowave gs addstyle

NAME

geowave gs addstyle - Add a GeoServer style

SYNOPSIS

geowave gs addstyle [options] <GeoWave style name>

DESCRIPTION

This command will add a GeoServer style

OPTIONS

  • * -sld, --stylesld

    • <style sld file>

geowave gs addws

NAME

geowave gs addws - Add GeoServer workspace

SYNOPSIS

geowave gs addws [options] <workspace name>

DESCRIPTION

This command will add a GeoServer workspace

OPTIONS

There are currently no options available for this command

geowave gs getcs

NAME

geowave gs getcs - Get GeoServer CoverageStore info

SYNOPSIS

geowave gs getcs [options] <coverage store name>

DESCRIPTION

This command will return GeoServer CoverageStore info

OPTIONS

  • -ws, --workspace

    • <workspace name>

geowave gs getcv

NAME

geowave gs getcv - Get a GeoServer coverage’s info

SYNOPSIS

geowave gs getcv [options] <coverage name>

DESCRIPTION

This command will return a GeoServer coverage’s info

OPTIONS

  • -cs, --coverageStore

    • <coverage store name>

  • -ws, --workspace

    • <workspace name>

geowave gs getds

NAME

geowave gs getds - Get GeoServer DataStore info

SYNOPSIS

geowave gs getds [options] <datastore name>

DESCRIPTION

This command will return GeoServer DataStore info

OPTIONS

  • -ws, --workspace

    • <workspace name>

geowave gs getfl

NAME

geowave gs getfl - Get GeoServer feature layer info

SYNOPSIS

geowave gs getfl [options] <layer name>

DESCRIPTION

This command will return GeoServer feature layer info

OPTIONS

There are currently no options available for this command

geowave gs getsa

NAME

geowave gs getsa - Get GeoWave store adapters

SYNOPSIS

geowave gs getsa [options] <store name>

DESCRIPTION

This command will return GeoWave store adapters

OPTIONS

There are currently no options available for this command

geowave gs getstyle

NAME

geowave gs getstyle - Get GeoServer Style info

SYNOPSIS

geowave gs getstyle [options] <style name>

DESCRIPTION

This command will return GeoServer Style info

OPTIONS

There are currently no options available for this command

geowave gs listcs

NAME

geowave gs listcs - List GeoServer coverage stores

SYNOPSIS

geowave gs listcs [options]

DESCRIPTION

This command will list all GeoServer coverage stores

OPTIONS

  • -ws, --workspace

    • <workspace name>

geowave gs listcv

NAME

geowave gs listcv - List GeoServer Coverages

SYNOPSIS

geowave gs listcv [options] <coverage store name>

DESCRIPTION

This command will list all GeoServer Coverages

OPTIONS

  • -ws, --workspace

    • <workspace name>

geowave gs listds

NAME

geowave gs listds - List GeoServer datastores

SYNOPSIS

geowave gs listds [options]

DESCRIPTION

This command will list all GeoServer datastores

OPTIONS

  • -ws, --workspace

    • <workspace name>

geowave gs listfl

NAME

geowave gs listfl - List GeoServer feature layers

SYNOPSIS

geowave gs listfl [options]

DESCRIPTION

This command will list all GeoServer feature layers

OPTIONS

  • -ds, --datastore

    • Datastore Name

  • -g, --geowaveOnly

    • Show only GeoWave feature layers (default: false)

    • Default: false

  • -ws, --workspace

    • Workspace Name

geowave gs liststyles

NAME

geowave gs liststyles - List GeoServer styles

SYNOPSIS

geowave gs liststyles [options]

DESCRIPTION

This command will list all GeoServer styles

OPTIONS

There are currently no options available for this command

geowave gs listws

NAME

geowave gs listws - List GeoServer workspaces

SYNOPSIS

geowave gs listws [options]

DESCRIPTION

This command will list all GeoServer workspaces

OPTIONS

There are currently no options available for this command

geowave gs rmcs

NAME

geowave gs rmcs - Remove GeoServer Coverage Store

SYNOPSIS

geowave gs rmcs [options] <coverage store name>

DESCRIPTION

This command will remove a GeoServer Coverage Store

OPTIONS

  • -ws, --workspace

    • Workspace Name

geowave gs rmcv

NAME

geowave gs rmcv - Remove a GeoServer coverage

SYNOPSIS

geowave gs rmcv [options] <coverage name>

DESCRIPTION

This command will remove a GeoServer coverage

OPTIONS

  • * -cs, --cvgstore

    • <coverage store name>

  • -ws, --workspace

    • <workspace name>

geowave gs rmds

NAME

geowave gs rmds - Remove GeoServer DataStore

SYNOPSIS

geowave gs rmds [options] <datastore name>

DESCRIPTION

This command will remove a GeoServer DataStore

OPTIONS

  • -ws, --workspace

    • <workspace name>

geowave gs rmfl

NAME

geowave gs rmfl - Remove GeoServer feature Layer

SYNOPSIS

geowave gs rmfl [options] <layer name>

DESCRIPTION

This command will remove a GeoServer feature Layer

OPTIONS

There are currently no options available for this command

geowave gs rmstyle

NAME

geowave gs rmstyle - Remove GeoServer Style

SYNOPSIS

geowave gs rmstyle [options] <style name>

DESCRIPTION

This command will remove a GeoServer Style

OPTIONS

There are currently no options available for this command

geowave gs rmws

NAME

geowave gs rmws - Remove GeoServer workspace

SYNOPSIS

geowave gs rmws [options] <workspace name>

DESCRIPTION

This command will remove a GeoServer workspace

OPTIONS

There are currently no options available for this command

geowave gs setls

NAME

geowave gs setls - Set GeoServer Layer Style

SYNOPSIS

geowave gs setls [options] <layer name>

DESCRIPTION

This command will set a GeoServer layer style

OPTIONS

  • * -sn, --styleName

    • <style name>


HBase Commands

Utility operations to combine statistics in hbase (Required options are designated with an *)

geowave hbase combinestats

NAME

geowave hbase combinestats - Combine all statistics in HBase namespace

SYNOPSIS

geowave hbase combinestats [options] <storename> <adapter id>

DESCRIPTION

This command will combine all statistics in an HBase namespace

OPTIONS

There are currently no options for this command

geowave hbase runserver

NAME

geowave hbase runserver - Runs a standalone mini HBase server for test and debug with GeoWave

SYNOPSIS

geowave hbase runserver [options]

DESCRIPTION

This command will run a standalone mini single-node HBase server, which can be used locally for testing and debugging GeoWave, without needing to stand up an entire cluster.

OPTIONS

There are currently no options for this command


Landsat8 Commands

Operations to analyze, download, and ingest Landsat 8 imagery publicly available on AWS (Required options are designated with an *)

geowave landsat analyze

NAME

geowave landsat analyze - Print out basic aggregate statistics for available Landsat 8 imagery

SYNOPSIS

geowave landsat analyze [options]

DESCRIPTION

This command will print out basic aggregate statistics for the available Landsat 8 imagery
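
EXAMPLE

For example, to summarize only the single best scene per path/row while reusing a previously cached scenes catalog (the workspace directory here is hypothetical):

geowave landsat analyze --nbestscenes 1 --nbestperspatial --usecachedscenes -ws ./landsat8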

OPTIONS

  • --cql

    • An optional CQL expression to filter the ingested imagery. The feature type for the expression has the following attributes: shape (Geometry), acquisitionDate (Date), cloudCover (double), processingLevel (String), path (int), and row (int); the feature ID is the scene's entityId. Additionally, attributes of the individual bands can be used, such as band (String), sizeMB (double), and bandDownloadUrl (String)

    • Default: <empty string>

  • --nbestbands

    • An option to identify and only use a set number of bands with the best cloud cover

    • Default: 0

  • --nbestperspatial

    • A boolean flag that, when applied with --nbestscenes or --nbestbands, will aggregate scenes and/or bands by path/row

    • Default: false

  • --nbestscenes

    • An option to identify and only use a set number of scenes with the best cloud cover

    • Default: 0

  • --sincelastrun

    • An option to check the scenes list from the workspace and, if it exists, to only ingest data since the last scene.

    • Default: false

  • --usecachedscenes

    • An option to run against the existing scenes catalog in the workspace directory if it exists.

    • Default: false

  • -ws, --workspaceDir

    • A local directory to write temporary files needed for landsat 8 ingest.

    • Default is <TEMP_DIR>/landsat8

    • Default: landsat8

geowave landsat download

NAME

geowave landsat download - Download Landsat 8 imagery to a local directory

SYNOPSIS

geowave landsat download [options]

DESCRIPTION

This command will download the Landsat 8 imagery to a local directory

OPTIONS

  • --cql

    • An optional CQL expression to filter the ingested imagery. The feature type for the expression has the following attributes: shape (Geometry), acquisitionDate (Date), cloudCover (double), processingLevel (String), path (int), and row (int); the feature ID is the scene's entityId. Additionally, attributes of the individual bands can be used, such as band (String), sizeMB (double), and bandDownloadUrl (String)

    • Default: <empty string>

  • --nbestbands

    • An option to identify and only use a set number of bands with the best cloud cover

    • Default: 0

  • --nbestperspatial

    • A boolean flag that, when applied with --nbestscenes or --nbestbands, will aggregate scenes and/or bands by path/row

    • Default: false

  • --nbestscenes

    • An option to identify and only use a set number of scenes with the best cloud cover

    • Default: 0

  • --sincelastrun

    • An option to check the scenes list from the workspace and, if it exists, to only ingest data since the last scene.

    • Default: false

  • --usecachedscenes

    • An option to run against the existing scenes catalog in the workspace directory if it exists.

    • Default: false

  • -ws, --workspaceDir

    • A local directory to write temporary files needed for landsat 8 ingest.

    • Default is <TEMP_DIR>/landsat8

    • Default: landsat8

geowave landsat ingest

NAME

geowave landsat ingest - Ingest routine for locally downloading Landsat 8 imagery and ingesting it into GeoWave’s raster store and in parallel ingesting the scene metadata into GeoWave’s vector store. These two stores can actually be the same or they can be different.

SYNOPSIS

geowave landsat ingest [options] <storename> <comma delimited index/group list>

DESCRIPTION

This command provides a standard ingest routine for downloading Landsat 8 imagery and ingesting it into GeoWave’s raster store. In parallel, it allows for ingesting the scene metadata into GeoWave’s vector store.
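
EXAMPLE

A sketch, assuming a configured store my_store and an index named spatial-idx (both names hypothetical), ingesting the single best scene per path/row and storing an image pyramid:

geowave landsat ingest --nbestscenes 1 --nbestperspatial --pyramid my_store spatial-idx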

OPTIONS

  • --converter

    • Prior to ingesting an image, this converter will be used to massage the data. The default is not to convert the data.

  • --coverage

    • The name to give to each unique coverage. Freemarker templating can be used for variable substitution based on the same attributes used for filtering. The default coverage name is '${entityId}_${band}'. If ${band} is unused in the coverage name, all bands will be merged together into the same coverage.

    • Default: ${entityId}_${band}

  • --cql

    • An optional CQL expression to filter the ingested imagery. The feature type for the expression has the following attributes: shape (Geometry), acquisitionDate (Date), cloudCover (double), processingLevel (String), path (int), and row (int); the feature ID is the scene's entityId. Additionally, attributes of the individual bands can be used, such as band (String), sizeMB (double), and bandDownloadUrl (String)

    • Default: <empty string>

  • --crop

    • Use the spatial constraint provided in CQL to crop the image. If no spatial constraint is provided, this will not have an effect.

    • Default: false

  • --histogram

    • An option to store the histogram of the values of the coverage so that histogram equalization will be performed

    • Default: false

  • --nbestbands

    • An option to identify and only use a set number of bands with the best cloud cover

    • Default: 0

  • --nbestperspatial

    • A boolean flag that, when applied with --nbestscenes or --nbestbands, will aggregate scenes and/or bands by path/row

    • Default: false

  • --nbestscenes

    • An option to identify and only use a set number of scenes with the best cloud cover

    • Default: 0

  • --overwrite

    • An option to overwrite images that are ingested in the local workspace directory. By default it will keep an existing image rather than downloading it again.

    • Default: false

  • --pyramid

    • An option to store an image pyramid for the coverage

    • Default: false

  • --retainimages

    • An option to keep the images that are ingested in the local workspace directory. By default it will delete the local file after it is ingested successfully.

    • Default: false

  • --sincelastrun

    • An option to check the scenes list from the workspace and, if it exists, to only ingest data since the last scene.

    • Default: false

  • --skipMerge

    • By default, the ingest will automerge overlapping tiles as a post-processing optimization step for efficient retrieval; this option skips the merge process

    • Default: false

  • --subsample

    • Subsample the image prior to ingest by the scale factor provided. The scale factor should be an integer value greater than 1.

    • Default: 1

  • --tilesize

    • The option to set the pixel size for each tile stored in GeoWave.

    • Default: 512

  • --usecachedscenes

    • An option to run against the existing scenes catalog in the workspace directory if it exists.

    • Default: false

  • --vectorindex

    • When ingesting as both vectors and rasters, you may want each indexed differently. This will override the index used for vector output.

  • --vectorstore

    • When ingesting as both vectors and rasters, you may want to ingest into different stores. This will override the store for vector output.

  • -ws, --workspaceDir

    • A local directory to write temporary files needed for landsat 8 ingest. Default is <TEMP_DIR>/landsat8

    • Default: landsat8

geowave landsat ingestraster

NAME

geowave landsat ingestraster - Ingest routine for locally downloading Landsat 8 imagery and ingesting it into GeoWave

SYNOPSIS

geowave landsat ingestraster [options] <storename> <comma delimited index/group list>

DESCRIPTION

This command provides an ingest routine for locally downloading Landsat 8 imagery and ingesting it into GeoWave.

OPTIONS

  • --converter

    • Prior to ingesting an image, this converter will be used to massage the data. The default is not to convert the data.

  • --coverage

    • The name to give to each unique coverage. Freemarker templating can be used for variable substitution based on the same attributes used for filtering. The default coverage name is '${entityId}_${band}'. If ${band} is unused in the coverage name, all bands will be merged together into the same coverage.

    • Default: ${entityId}_${band}

  • --cql

    • An optional CQL expression to filter the ingested imagery. The feature type for the expression has the following attributes: shape (Geometry), acquisitionDate (Date), cloudCover (double), processingLevel (String), path (int), and row (int); the feature ID is the scene's entityId. Additionally, attributes of the individual bands can be used, such as band (String), sizeMB (double), and bandDownloadUrl (String)

    • Default: <empty string>

  • --crop

    • Use the spatial constraint provided in CQL to crop the image. If no spatial constraint is provided, this will not have an effect.

    • Default: false

  • --histogram

    • An option to store the histogram of the values of the coverage so that histogram equalization will be performed

    • Default: false

  • --nbestbands

    • An option to identify and only use a set number of bands with the best cloud cover

    • Default: 0

  • --nbestperspatial

    • A boolean flag that, when applied with --nbestscenes or --nbestbands, will aggregate scenes and/or bands by path/row

    • Default: false

  • --nbestscenes

    • An option to identify and only use a set number of scenes with the best cloud cover

    • Default: 0

  • --overwrite

    • An option to overwrite images that are ingested in the local workspace directory. By default it will keep an existing image rather than downloading it again.

    • Default: false

  • --pyramid

    • An option to store an image pyramid for the coverage

    • Default: false

  • --retainimages

    • An option to keep the images that are ingested in the local workspace directory. By default it will delete the local file after it is ingested successfully.

    • Default: false

  • --sincelastrun

    • An option to check the scenes list from the workspace and, if it exists, to only ingest data since the last scene.

    • Default: false

  • --skipMerge

    • By default, the ingest will automerge overlapping tiles as a post-processing optimization step for efficient retrieval; this option skips the merge process

    • Default: false

  • --subsample

    • Subsample the image prior to ingest by the scale factor provided. The scale factor should be an integer value greater than 1.

    • Default: 1

  • --tilesize

    • The option to set the pixel size for each tile stored in GeoWave. The default is 512

    • Default: 512

  • --usecachedscenes

    • An option to run against the existing scenes catalog in the workspace directory if it exists.

    • Default: false

  • -ws, --workspaceDir

    • A local directory to write temporary files needed for landsat 8 ingest. Default is <TEMP_DIR>/landsat8

    • Default: landsat8

geowave landsat ingestvector

NAME

geowave landsat ingestvector - Ingest routine for searching landsat scenes that match certain criteria and ingesting the scene and band metadata into GeoWave’s vector store.

SYNOPSIS

geowave landsat ingestvector [options] <storename> <comma delimited index/group list>

DESCRIPTION

This command provides an ingest routine for searching landsat scenes that match certain criteria, and ingesting the scene and band metadata into GeoWave’s vector store.

OPTIONS

  • --cql

    • An optional CQL expression to filter the ingested imagery. The feature type for the expression has the following attributes: shape (Geometry), acquisitionDate (Date), cloudCover (double), processingLevel (String), path (int), and row (int); the feature ID is the scene's entityId. Additionally, attributes of the individual bands can be used, such as band (String), sizeMB (double), and bandDownloadUrl (String)

    • Default: <empty string>

  • --nbestbands

    • An option to identify and only use a set number of bands with the best cloud cover

    • Default: 0

  • --nbestperspatial

    • A boolean flag that, when applied with --nbestscenes or --nbestbands, will aggregate scenes and/or bands by path/row

    • Default: false

  • --nbestscenes

    • An option to identify and only use a set number of scenes with the best cloud cover

    • Default: 0

  • --sincelastrun

    • An option to check the scenes list from the workspace and, if it exists, to only ingest data since the last scene.

    • Default: false

  • --usecachedscenes

    • An option to run against the existing scenes catalog in the workspace directory if it exists.

    • Default: false

  • -ws, --workspaceDir

    • A local directory to write temporary files needed for landsat 8 ingest. Default is <TEMP_DIR>/landsat8

    • Default: landsat8


OSM Commands

Operations to ingest Open Street Map (OSM) nodes, ways and relations to GeoWave (Required options are designated with an *)

Commands:

geowave osm ingest

NAME

geowave osm ingest - Ingest and convert OSM data from HDFS to GeoWave

SYNOPSIS

geowave osm ingest [options] <hdfs host:port> <path to base directory to read from> <store name>

DESCRIPTION

This command will ingest and convert OSM data from HDFS to GeoWave.
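
EXAMPLE

A sketch, assuming OSM data has already been staged to HDFS under /user/osm and a store my_store is configured (the host, path, and store names are hypothetical):

geowave osm ingest -t node hdfs-host:8020 /user/osm my_store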

OPTIONS

  • -jn, --jobName

    • Name of mapreduce job

    • Default: Ingest (mcarrier)

  • -m, --mappingFile

    • Mapping file, imposm3 form

  • --table

    • OSM Table name in GeoWave

    • Default: OSM

  • * -t, --type

    • Mapper type - one of node, way, or relation

  • -v, --visibility

    • The visibility of the data ingested (optional; default is 'public')

geowave osm stage

NAME

geowave osm stage - Stage OSM data to HDFS

SYNOPSIS

geowave osm stage [options] <file or directory> <hdfs host:port> <path to base directory to write to>

DESCRIPTION

This command will stage OSM data from a local directory and write it to HDFS
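
EXAMPLE

For example, to stage a local PBF extract to HDFS (the file, host, and path names are hypothetical):

geowave osm stage ./extract.pbf hdfs-host:8020 /user/osm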

OPTIONS

  • --extension

    • PBF File extension

    • Default: .pbf


Raster Commands

Operations to perform transformations on raster data in GeoWave (Required options are designated with an *)

Commands:

geowave raster resize

NAME

geowave raster resize - Resize Raster Tiles

SYNOPSIS

geowave raster resize [options] <input store name> <output store name>

DESCRIPTION

This command will resize raster tiles that are stored in a GeoWave datastore, and write the resized tiles to a new output store.
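
EXAMPLE

A minimal sketch with hypothetical store, coverage, and host names, writing 256-pixel tiles:

geowave raster resize --hdfsHostPort localhost:8020 --jobSubmissionHostPort localhost:8032 --inputCoverageName my_coverage --outputCoverageName my_coverage_small --outputTileSize 256 input_store output_store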

OPTIONS

  • * --hdfsHostPort

    • the hdfs host port

  • --indexId

    • The index that the input raster is stored in

  • * --inputCoverageName

    • The name of the input raster coverage

  • * --jobSubmissionHostPort

    • The job submission tracker

  • --maxSplits

    • The max partitions for the input data

  • --minSplits

    • The min partitions for the input data

  • * --outputCoverageName

    • The output raster coverage name

  • * --outputTileSize

    • The tile size to output


Vector Commands

Vector data operations (Required options are designated with an *)

geowave vector cqldelete

NAME

geowave vector cqldelete - Delete data that matches a CQL filter

SYNOPSIS

geowave vector cqldelete [options] <storename>

DESCRIPTION

This command will delete all data in a data store that matches a CQL filter.
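
EXAMPLE

For example, to delete all features of a (hypothetical) adapter gpxpoint that fall within a bounding box (the geometry attribute name in the CQL depends on your feature type):

geowave vector cqldelete --adapterId gpxpoint --cql "BBOX(geometry,-90,40,-80,50)" my_store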

OPTIONS

  • --adapterId

    • Optional ability to provide an adapter ID

  • * --cql

    • CQL Filter for delete

  • --debug

    • Print out additional info for debug purposes

    • Default: false

  • --indexId

    • The name of the index (optional)

geowave vector localexport

NAME

geowave vector localexport - Export data directly

SYNOPSIS

geowave vector localexport [options] <store name>

DESCRIPTION

This command will export data from a data store
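
EXAMPLE

A sketch, exporting from a (hypothetical) store my_store to a local file (the output file name is hypothetical):

geowave vector localexport --outputFile ./export.avro my_store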

OPTIONS

  • --adapterIds

    • Comma separated list of adapter Ids

  • --batchSize

    • Records to process at a time

    • Default: 10000

  • --cqlFilter

    • Filter exported data based on CQL filter

  • --indexId

    • The index to export from

  • * --outputFile

    • The file to export data to

geowave vector mrexport

NAME

geowave vector mrexport - Export data using map-reduce

SYNOPSIS

geowave vector mrexport [options] <hdfs host:port> <path to base directory to write to> <store name>

DESCRIPTION

This command will perform a data export for data in a data store, and will use MapReduce to support high-volume data stores.

OPTIONS

  • --adapterIds

    • Comma separated list of adapter Ids

  • --batchSize

    • Records to process at a time

    • Default: 10000

  • --cqlFilter

    • Filter exported data based on CQL filter

  • --indexId

    • The index to export from

  • --maxSplits

    • The max partitions for the input data

  • --minSplits

    • The min partitions for the input data

  • --resourceManagerHostPort