Harvester Admin API (v0.1)

postHarvestable

Create harvest job configuration

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL
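
As a minimal sketch, the three Okapi headers above could be assembled in Python as follows. The helper name `okapi_headers` and the tenant, token and URL values are placeholders invented for illustration; the `Content-Type` header is added only because the request body schema is declared as application/json.

```python
def okapi_headers(tenant: str, token: str, okapi_url: str) -> dict:
    """Build the three Okapi headers listed above, plus a JSON content type."""
    return {
        "X-Okapi-Tenant": tenant,
        "X-Okapi-Token": token,
        "X-Okapi-Url": okapi_url,
        # Assumption: the body is sent as JSON, per the request body schema.
        "Content-Type": "application/json",
    }


# Placeholder values, not real credentials or hosts.
headers = okapi_headers("example-tenant", "example-token", "https://okapi.example.org")
```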

Request Body schema: application/json
required

One of

id

string^[0-9]*$

Unique, numeric ID for the job definition. Will be assigned if not provided.

name

required

string

The name assigned to the harvest configuration.

description

string

Free-form description of the configuration, to support administration.

openAccess

string

Enum: "true" "false"

tbd

storage

required

object or object

Reference to the storage configuration to use.

Any of

id

required

string

Reference to the ID of the storage engine to use.

transformation

required

object or object

Reference to the transformation pipeline to use.

Any of

id

required

string^[0-9]*$

Reference to the ID of the transformation pipeline to apply.

enabled

required

string

Enum: "true" "false"

Indicates whether the job is scheduled to run.

harvestImmediately

required

string

Enum: "true" "false"

Whether to harvest when the config is persisted.

scheduleString

string

Crontab-style schedule string (simplified), with five fields in this order: minute (0-59), hour (0-24), day of month (* or 1-31), month (* or 1-12), day of week (* or 0-6).
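
The field ranges above can be checked before a configuration is submitted. The sketch below is an illustration only: `check_schedule` is a hypothetical helper, it assumes the five fields are separated by whitespace (the usual crontab convention), and it reads the description literally, so `*` is accepted only for the three fields where the description lists it. Harvester's own parser may be more permissive.

```python
# Ranges copied from the field description: (name, low, high, "*" allowed).
FIELDS = [
    ("minute", 0, 59, False),
    ("hour", 0, 24, False),
    ("day of month", 1, 31, True),
    ("month", 1, 12, True),
    ("day of week", 0, 6, True),
]


def check_schedule(schedule: str) -> list:
    """Return a list of problems; an empty list means the string looks valid."""
    parts = schedule.split()
    if len(parts) != len(FIELDS):
        return [f"expected {len(FIELDS)} fields, got {len(parts)}"]
    problems = []
    for value, (name, low, high, star_ok) in zip(parts, FIELDS):
        if value == "*":
            if not star_ok:
                problems.append(f"{name}: '*' is not listed as allowed")
            continue
        is_number = value.isascii() and value.isdigit()
        if not is_number or not low <= int(value) <= high:
            problems.append(f"{name}: {value!r} is outside {low}-{high}")
    return problems
```

For instance, `check_schedule("30 2 * * 0")` returns an empty list, while a string with only four fields returns a single problem describing the field count.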

dateFormat

string

For example yyyy-MM-dd'T'hh:mm:ss'Z'.

url

required

string

The URL to access the data from.

timeout

string^[0-9]*$

Default: "300"

Connection/read timeout in seconds; how it is applied depends on the specific protocol used for fetching data.

cacheEnabled

string

Enum: "true" "false"

Whether or not to store incoming records in Harvester's file system.

diskRun

string

Enum: "true" "false"

Whether or not to run harvest job from records cached in a previous job run.

storageBatchLimit

string^[0-9]*$

Batch size: Number of records to send to storage at a time.

recordLimit

string^[0-9]*$

Maximum number of records to harvest.

laxParsing

string

Default: "false"

Enum: "true" "false"

When enabled, Harvester will attempt to parse malformed XML (missing closing tags, entities).

constantFields

string

Values related to target handling in MasterKey. Otherwise obsolete.

storeOriginal

string

Default: "false"

Enum: "true" "false"

Indicates whether to store incoming original record, if supported by the job type and the storage configuration.

managedBy

string

Free-text field for tagging a job with the producer or manager of the resource. Multiple tags may be separated by commas. The tags can be used for filtering status reports by job administrators for example.

usedBy

string

Free form administrative information; could be tags for the clients using this harvestable.

serviceProvider

string

Free-text field for administrative information about the harvest job.

contactNotes

string

Free form text field for administrator's notes.

technicalNotes

string

Free-text field for administrative information.

logLevel

string

Enum: "ERROR" "WARN" "INFO" "DEBUG" "TRACE"

Specifies the logging level for the job with TRACE being the most (extremely) verbose. INFO is the recommended log level in most cases.

failedRecordsLogging

string

Default: "CLEAN_DIRECTORY"

Enum: "NO_STORE" "CLEAN_DIRECTORY" "CREATE_OVERWRITE" "ADD_ALL"

Specify whether or not failed records should be saved as XML files in a designated log directory. Also specifies retention policy for the directory, that is, whether to retain files that were saved in previous runs (CLEAN_DIRECTORY = don't retain) and, if so, whether to overwrite any existing files if the same record fails again (CREATE_OVERWRITE) or rather add a sequence number to the new file name in order not to overwrite (ADD_ALL).

maxSavedFailedRecordsPerRun

string^[0-9]*$

Default: "100"

Sets a maximum number of files to save in the failed records directory per run. The job log will tell when the limit is reached.

maxSavedFailedRecordsTotal

string^[0-9]*$

Default: "1000"

Sets a maximum number of files to be saved in the failed records directory at any given time, counted as the sum of previously saved records (that were not cleaned up before this run) plus any new records added during the run. The job log will tell when the limit is reached.
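
One way to read the two limits together is sketched below. This is an interpretation of the descriptions, not Harvester's actual code: `files_allowed_this_run` is a hypothetical helper, and it assumes a run may save at most the per-run limit and never more than the room left under the total limit.

```python
def files_allowed_this_run(already_saved: int, per_run: int = 100, total: int = 1000) -> int:
    """Estimate how many failed-record files a run may still save.

    already_saved: files retained from earlier runs (not cleaned up).
    per_run:       maxSavedFailedRecordsPerRun (default "100").
    total:         maxSavedFailedRecordsTotal (default "1000").
    """
    room_under_total = max(total - already_saved, 0)
    return min(per_run, room_under_total)
```

With the defaults, a directory that already holds 950 files would leave room for 50 more in the next run under this reading.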

mailAddress

string

Comma-separated list of e-mail addresses that should receive notification on job completion.

mailLevel

string

Enum: "OK" "WARN" "ERROR"

The minimum severity of a job's completion status that will trigger email notification.
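
The threshold behaviour described above can be sketched as a simple comparison. The ordering OK < WARN < ERROR and the assumption that a job's completion status uses the same three values are inferred from the enum, not confirmed by this reference; `should_notify` is a hypothetical helper.

```python
# Assumed severity ordering, lowest to highest.
SEVERITY = {"OK": 0, "WARN": 1, "ERROR": 2}


def should_notify(job_status: str, mail_level: str) -> bool:
    """Return True when the job's status is at least as severe as mailLevel."""
    return SEVERITY[job_status] >= SEVERITY[mail_level]
```

Under this reading, a mailLevel of "OK" notifies on every completion, while "ERROR" notifies only on failed jobs.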

initiallyHarvested

string

Date and time, assigned by Harvester.

lastHarvestFinished

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration completed.

lastHarvestStarted

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration began.

lastUpdated

string

Assigned by API. The date and time when this definition was last modified.

nextHarvestSchedule

string

The date and time when a job with this definition should be run (if job is enabled).

amountHarvested

string^[0-9]*$

Assigned by API. Number of records harvested in last run. It seems this should really be an integer, but string is what the WSAPI gives us.
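
Because numeric values arrive as strings, a client will usually convert them. The sketch below is one possible approach; `to_count` is a hypothetical helper. Note that the pattern `^[0-9]*$` also matches the empty string, which is mapped to `None` here rather than to zero; that choice is an assumption.

```python
import re
from typing import Optional

# Same character class as the documented pattern ^[0-9]*$.
DIGITS = re.compile(r"[0-9]*")


def to_count(value: Optional[str]) -> Optional[int]:
    """Convert a numeric string such as amountHarvested to an int."""
    if value is None or value == "":
        return None  # assumption: an empty string means "no value yet"
    if not DIGITS.fullmatch(value):
        raise ValueError(f"not a numeric string: {value!r}")
    return int(value)
```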

message

string

Assigned by API. Message summarising results of last run.

type

required

string

Value: "xmlBulk"

Indicates bulk XML job.

retryCount

string^[0-9]*$

Default: "2"

Obsolete but allowed for XML bulk.

retryWait

string^[0-9]*$

Default: "60"

Obsolete but allowed for XML bulk.

allowErrors

string

Default: "false"

Enum: "true" "false"

Whether or not to continue despite harvest record errors.

allowCondReq

string

Default: "false"

Enum: "true" "false"

Whether or not to filter on file date to only harvest new XML files.

fromDate

string

Initial start date (yyyy-MM-dd) for incremental updates (when allowCondReq is 'true').

csvConfiguration

string

Semicolon-separated key-value pairs that specify parsing of a CSV file into XML for further processing (see Harvester documentation for details).

excludeFilePattern

string

Regular expression; setting to skip harvesting of files with names matching the given regular expression (see Harvester documentation for details).

expectedSchema

string

MIME type override (e.g., application/marc; charset=MARC-8).

includeFilePattern

string

Regular expression; setting to request harvesting of files with names matching the given regular expression unless those file names are simultaneously excluded by the excludeFilePattern. .zip, .gz, .tar included by default unless explicitly excluded by excludeFilePattern (see Harvester documentation for details).
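
The interplay of the two patterns can be sketched as follows. This is an illustration under several assumptions, none confirmed by this reference: `should_harvest` is a hypothetical helper, Python's `re.search` (partial match) stands in for whatever matching Harvester performs, and archive files are treated as included by a simple suffix test. See the Harvester documentation for the authoritative behaviour.

```python
import re
from typing import Optional

# Archive types the description says are included by default.
DEFAULT_ARCHIVES = (".zip", ".gz", ".tar")


def should_harvest(filename: str,
                   include: Optional[str] = None,
                   exclude: Optional[str] = None) -> bool:
    """Decide whether a file name passes the include/exclude patterns."""
    # Exclusion wins over everything else, including the archive defaults.
    if exclude and re.search(exclude, filename):
        return False
    # Assumption: with no include pattern, every remaining file is harvested.
    if include is None:
        return True
    if filename.endswith(DEFAULT_ARCHIVES):
        return True
    return re.search(include, filename) is not None
```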

outputSchema

string

MARC XML transformation format (application/marc or application/tmarc).

passiveMode

string

Default: "false"

Enum: "true" "false"

Whether or not to use passive mode for FTP transfers.

recurse

string

Default: "false"

Enum: "true" "false"

Whether or not to recurse into sub-folders in the source directory tree.

splitAt

string^[0-9]*$

Level/depth to split XML files at to extract records. Zero/empty disables split.

splitSize

string^[0-9]*$

Splits large XML files into chunks of splitSize records, to conserve memory during XSLT transformations.

json

object

Custom configurations in JSON format (has no current applications).

overwrite

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. Will delete all previously harvested data before beginning the next scheduled (or manually triggered) run, if set to true.

keepPartial

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. When true, partial records harvested during a failed harvest run will be retained in Solr.

Responses

Request samples

Payload

Content type

application/json

Example

harvestableXmlBulkPostPut

{"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"id": "string"
},
"transformation": {"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"storageBatchLimit": "string",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"initiallyHarvested": "string",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"type": "xmlBulk",
"retryCount": "2",
"retryWait": "60",
"allowErrors": "true",
"allowCondReq": "true",
"fromDate": "string",
"csvConfiguration": "string",
"excludeFilePattern": "string",
"expectedSchema": "string",
"includeFilePattern": "string",
"outputSchema": "string",
"passiveMode": "true",
"recurse": "true",
"splitAt": "string",
"splitSize": "string",
"json": { },
"overwrite": "true",
"keepPartial": "true"
}
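
The sample above lists every property. A smaller body containing only the properties marked required might be built as sketched below. The helper name `xml_bulk_payload`, the job name, the URL and the IDs "201" and "301" are placeholders invented for illustration. Boolean and numeric values are passed as strings, as the schema declares them.

```python
import json


def xml_bulk_payload(name: str, url: str, storage_id: str,
                     transformation_id: str, **optional: str) -> dict:
    """Build a request body with the required properties for an xmlBulk job."""
    payload = {
        "name": name,
        "type": "xmlBulk",
        "url": url,
        "storage": {"id": storage_id},
        "transformation": {"id": transformation_id},
        # Booleans are strings in this API: "true" or "false".
        "enabled": "false",
        "harvestImmediately": "false",
    }
    # Optional properties (for example logLevel) override or extend the defaults.
    payload.update(optional)
    return payload


body = json.dumps(
    xml_bulk_payload(
        "Example bulk job",
        "https://data.example.org/records/",
        storage_id="201",
        transformation_id="301",
        logLevel="INFO",
    )
)
```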

Response samples

201
400

Content type

application/json

{"type": "xmlBulk",
"allowErrors": "true",
"overwrite": "true",
"allowCondReq": "true",
"fromDate": "string",
"csvConfiguration": "string",
"excludeFilePattern": "string",
"expectedSchema": "string",
"includeFilePattern": "string",
"outputSchema": "string",
"passiveMode": "true",
"recurse": "true",
"splitAt": "string",
"splitSize": "string",
"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"entityType": "inventoryStorageEntity",
"bulkSize": "string",
"currentStatus": "string",
"customClass": "string",
"enabled": "true",
"idAsString": "string",
"name": "string",
"url": "string"
},
"transformation": {"entityType": "basicTransformation",
"acl": "string",
"description": "string",
"enabled": "true",
"name": "string",
"parallel": "true",
"stepAssociations": [{"id": "string",
"position": "string",
"step": {"entityType": "xmlTransformationStep",
"acl": "string",
"description": "string",
"inputFormat": "string",
"name": "string",
"outputFormat": "string",
"script": "<'script' omitted from nested displays>",
"id": "string",
"testData": "<'testData' omitted from nested displays>",
"testOutput": "<'testOutput' omitted from nested displays>"
},
"transformation": "string"
}
],
"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"currentStatus": "NEW",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"acl": "string"
}

getHarvestables

Get brief harvest job definitions

query Parameters

query

string

CQL
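
A CQL expression must be URL-encoded when passed in the query parameter. The sketch below uses Python's standard library for that; `harvestables_query` is a hypothetical helper, and the `name` index in the example expression is an assumption, since this reference does not list the searchable fields.

```python
from urllib.parse import parse_qs, urlencode


def harvestables_query(cql: str) -> str:
    """Return the URL-encoded query string for a CQL expression."""
    return urlencode({"query": cql})


# Assumption: "name" is a searchable index; check the API before relying on it.
query_string = harvestables_query('name="Example*"')
```

The resulting string can be appended after a `?` to the endpoint URL; decoding it with `parse_qs` yields the original expression.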

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"titles": [{"id": "string",
"name": "string",
"currentStatus": "NEW",
"enabled": "true",
"storageUrl": "string",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"jobClass": "XmlBulkResource",
"amountHarvested": "string",
"message": "string"
}
]
}

postHarvestableXmlBulk

Create bulk XML harvest job configuration

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

id

string^[0-9]*$

Unique, numeric ID for the job definition. Will be assigned if not provided.

name

required

string

The name assigned to the harvest configuration.

description

string

Free-form description of the configuration, to support administration.

openAccess

string

Enum: "true" "false"

tbd

storage

required

object or object

Reference to the storage configuration to use.

Any of

id

required

string

Reference to the ID of the storage engine to use.

transformation

required

object or object

Reference to the transformation pipeline to use.

Any of

id

required

string^[0-9]*$

Reference to the ID of the transformation pipeline to apply.

enabled

required

string

Enum: "true" "false"

Indicates whether the job is scheduled to run.

harvestImmediately

required

string

Enum: "true" "false"

Whether to harvest when the config is persisted.

scheduleString

string

Crontab-style schedule string (simplified), with five fields in this order: minute (0-59), hour (0-24), day of month (* or 1-31), month (* or 1-12), day of week (* or 0-6).

dateFormat

string

For example yyyy-MM-dd'T'hh:mm:ss'Z'.

url

required

string

The URL to access the data from.

timeout

string^[0-9]*$

Default: "300"

Connection/read timeout in seconds; how it is applied depends on the specific protocol used for fetching data.

cacheEnabled

string

Enum: "true" "false"

Whether or not to store incoming records in Harvester's file system.

diskRun

string

Enum: "true" "false"

Whether or not to run harvest job from records cached in a previous job run.

storageBatchLimit

string^[0-9]*$

Batch size: Number of records to send to storage at a time.

recordLimit

string^[0-9]*$

Maximum number of records to harvest.

laxParsing

string

Default: "false"

Enum: "true" "false"

When enabled, Harvester will attempt to parse malformed XML (missing closing tags, entities).

constantFields

string

Values related to target handling in MasterKey. Otherwise obsolete.

storeOriginal

string

Default: "false"

Enum: "true" "false"

Indicates whether to store incoming original record, if supported by the job type and the storage configuration.

managedBy

string

Free-text field for tagging a job with the producer or manager of the resource. Multiple tags may be separated by commas. The tags can be used for filtering status reports by job administrators for example.

usedBy

string

Free form administrative information; could be tags for the clients using this harvestable.

serviceProvider

string

Free-text field for administrative information about the harvest job.

contactNotes

string

Free form text field for administrator's notes.

technicalNotes

string

Free-text field for administrative information.

logLevel

string

Enum: "ERROR" "WARN" "INFO" "DEBUG" "TRACE"

Specifies the logging level for the job with TRACE being the most (extremely) verbose. INFO is the recommended log level in most cases.

failedRecordsLogging

string

Default: "CLEAN_DIRECTORY"

Enum: "NO_STORE" "CLEAN_DIRECTORY" "CREATE_OVERWRITE" "ADD_ALL"

Specify whether or not failed records should be saved as XML files in a designated log directory. Also specifies retention policy for the directory, that is, whether to retain files that were saved in previous runs (CLEAN_DIRECTORY = don't retain) and, if so, whether to overwrite any existing files if the same record fails again (CREATE_OVERWRITE) or rather add a sequence number to the new file name in order not to overwrite (ADD_ALL).

maxSavedFailedRecordsPerRun

string^[0-9]*$

Default: "100"

Sets a maximum number of files to save in the failed records directory per run. The job log will tell when the limit is reached.

maxSavedFailedRecordsTotal

string^[0-9]*$

Default: "1000"

Sets a maximum number of files to be saved in the failed records directory at any given time, counted as the sum of previously saved records (that were not cleaned up before this run) plus any new records added during the run. The job log will tell when the limit is reached.

mailAddress

string

Comma-separated list of e-mail addresses that should receive notification on job completion.

mailLevel

string

Enum: "OK" "WARN" "ERROR"

The minimum severity of a job's completion status that will trigger email notification.

initiallyHarvested

string

Date and time, assigned by Harvester.

lastHarvestFinished

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration completed.

lastHarvestStarted

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration began.

lastUpdated

string

Assigned by API. The date and time when this definition was last modified.

nextHarvestSchedule

string

The date and time when a job with this definition should be run (if job is enabled).

amountHarvested

string^[0-9]*$

Assigned by API. Number of records harvested in last run. It seems this should really be an integer, but string is what the WSAPI gives us.

message

string

Assigned by API. Message summarising results of last run.

type

required

string

Value: "xmlBulk"

Indicates bulk XML job.

retryCount

string^[0-9]*$

Default: "2"

Obsolete but allowed for XML bulk.

retryWait

string^[0-9]*$

Default: "60"

Obsolete but allowed for XML bulk.

allowErrors

string

Default: "false"

Enum: "true" "false"

Whether or not to continue despite harvest record errors.

allowCondReq

string

Default: "false"

Enum: "true" "false"

Whether or not to filter on file date to only harvest new XML files.

fromDate

string

Initial start date (yyyy-MM-dd) for incremental updates (when allowCondReq is 'true').

csvConfiguration

string

Semicolon-separated key-value pairs that specify parsing of a CSV file into XML for further processing (see Harvester documentation for details).

excludeFilePattern

string

Regular expression; setting to skip harvesting of files with names matching the given regular expression (see Harvester documentation for details).

expectedSchema

string

MIME type override (e.g., application/marc; charset=MARC-8).

includeFilePattern

string

Regular expression; setting to request harvesting of files with names matching the given regular expression unless those file names are simultaneously excluded by the excludeFilePattern. .zip, .gz, .tar included by default unless explicitly excluded by excludeFilePattern (see Harvester documentation for details).

outputSchema

string

MARC XML transformation format (application/marc or application/tmarc).

passiveMode

string

Default: "false"

Enum: "true" "false"

Whether or not to use passive mode for FTP transfers.

recurse

string

Default: "false"

Enum: "true" "false"

Whether or not to recurse into sub-folders in the source directory tree.

splitAt

string^[0-9]*$

Level/depth to split XML files at to extract records. Zero/empty disables split.

splitSize

string^[0-9]*$

Splits large XML files into chunks of splitSize records, to conserve memory during XSLT transformations.

json

object

Custom configurations in JSON format (has no current applications).

overwrite

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. Will delete all previously harvested data before beginning the next scheduled (or manually triggered) run, if set to true.

keepPartial

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. When true, partial records harvested during a failed harvest run will be retained in Solr.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"id": "string"
},
"transformation": {"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"storageBatchLimit": "string",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"initiallyHarvested": "string",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"type": "xmlBulk",
"retryCount": "2",
"retryWait": "60",
"allowErrors": "true",
"allowCondReq": "true",
"fromDate": "string",
"csvConfiguration": "string",
"excludeFilePattern": "string",
"expectedSchema": "string",
"includeFilePattern": "string",
"outputSchema": "string",
"passiveMode": "true",
"recurse": "true",
"splitAt": "string",
"splitSize": "string",
"json": { },
"overwrite": "true",
"keepPartial": "true"
}

Response samples

201
400

Content type

application/json

{"type": "xmlBulk",
"allowErrors": "true",
"overwrite": "true",
"allowCondReq": "true",
"fromDate": "string",
"csvConfiguration": "string",
"excludeFilePattern": "string",
"expectedSchema": "string",
"includeFilePattern": "string",
"outputSchema": "string",
"passiveMode": "true",
"recurse": "true",
"splitAt": "string",
"splitSize": "string",
"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"entityType": "inventoryStorageEntity",
"bulkSize": "string",
"currentStatus": "string",
"customClass": "string",
"enabled": "true",
"idAsString": "string",
"name": "string",
"url": "string"
},
"transformation": {"entityType": "basicTransformation",
"acl": "string",
"description": "string",
"enabled": "true",
"name": "string",
"parallel": "true",
"stepAssociations": [{"id": "string",
"position": "string",
"step": {"entityType": "xmlTransformationStep",
"acl": "string",
"description": "string",
"inputFormat": "string",
"name": "string",
"outputFormat": "string",
"script": "<'script' omitted from nested displays>",
"id": "string",
"testData": "<'testData' omitted from nested displays>",
"testOutput": "<'testOutput' omitted from nested displays>"
},
"transformation": "string"
}
],
"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"currentStatus": "NEW",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"acl": "string"
}

postHarvestableOaiPmh

Create OAI-PMH harvest job configuration

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

id

string^[0-9]*$

Unique, numeric ID for the job definition. Will be assigned if not provided.

name

required

string

The name assigned to the harvest configuration.

description

string

Free-form description of the configuration, to support administration.

openAccess

string

Enum: "true" "false"

tbd

storage

required

object or object

Reference to the storage configuration to use.

Any of

id

required

string

Reference to the ID of the storage engine to use.

transformation

required

object or object

Reference to the transformation pipeline to use.

Any of

id

required

string^[0-9]*$

Reference to the ID of the transformation pipeline to apply.

enabled

required

string

Enum: "true" "false"

Indicates whether the job is scheduled to run.

harvestImmediately

required

string

Enum: "true" "false"

Whether to harvest when the config is persisted.

scheduleString

string

Crontab-style schedule string (simplified), with five fields in this order: minute (0-59), hour (0-24), day of month (* or 1-31), month (* or 1-12), day of week (* or 0-6).

dateFormat

string

For example yyyy-MM-dd'T'hh:mm:ss'Z'.

url

required

string

The URL to access the data from.

timeout

string^[0-9]*$

Default: "300"

Connection/read timeout in seconds; how it is applied depends on the specific protocol used for fetching data.

cacheEnabled

string

Enum: "true" "false"

Whether or not to store incoming records in Harvester's file system.

diskRun

string

Enum: "true" "false"

Whether or not to run harvest job from records cached in a previous job run.

storageBatchLimit

string^[0-9]*$

Batch size: Number of records to send to storage at a time.

recordLimit

string^[0-9]*$

Maximum number of records to harvest.

laxParsing

string

Default: "false"

Enum: "true" "false"

When enabled, Harvester will attempt to parse malformed XML (missing closing tags, entities).

constantFields

string

Values related to target handling in MasterKey. Otherwise obsolete.

storeOriginal

string

Default: "false"

Enum: "true" "false"

Indicates whether to store incoming original record, if supported by the job type and the storage configuration.

managedBy

string

Free-text field for tagging a job with the producer or manager of the resource. Multiple tags may be separated by commas. The tags can be used for filtering status reports by job administrators for example.

usedBy

string

Free form administrative information; could be tags for the clients using this harvestable.

serviceProvider

string

Free-text field for administrative information about the harvest job.

contactNotes

string

Free form text field for administrator's notes.

technicalNotes

string

Free-text field for administrative information.

logLevel

string

Enum: "ERROR" "WARN" "INFO" "DEBUG" "TRACE"

Specifies the logging level for the job with TRACE being the most (extremely) verbose. INFO is the recommended log level in most cases.

failedRecordsLogging

string

Default: "CLEAN_DIRECTORY"

Enum: "NO_STORE" "CLEAN_DIRECTORY" "CREATE_OVERWRITE" "ADD_ALL"

Specify whether or not failed records should be saved as XML files in a designated log directory. Also specifies retention policy for the directory, that is, whether to retain files that were saved in previous runs (CLEAN_DIRECTORY = don't retain) and, if so, whether to overwrite any existing files if the same record fails again (CREATE_OVERWRITE) or rather add a sequence number to the new file name in order not to overwrite (ADD_ALL).

maxSavedFailedRecordsPerRun

string^[0-9]*$

Default: "100"

Sets a maximum number of files to save in the failed records directory per run. The job log will tell when the limit is reached.

maxSavedFailedRecordsTotal

string^[0-9]*$

Default: "1000"

Sets a maximum number of files to be saved in the failed records directory at any given time, counted as the sum of previously saved records (that were not cleaned up before this run) plus any new records added during the run. The job log will tell when the limit is reached.

mailAddress

string

Comma-separated list of e-mail addresses that should receive notification on job completion.

mailLevel

string

Enum: "OK" "WARN" "ERROR"

The minimum severity of a job's completion status that will trigger email notification.

lastHarvestFinished

string

Assigned by API. The date and time when the most recent harvest job with this configuration completed.

initiallyHarvested

string

Date and time, assigned by Harvester.

lastHarvestStarted

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration began.

lastUpdated

string

Assigned by Harvester. The date and time when this definition was last modified.

nextHarvestSchedule

string

The date and time when a job with this definition should be run (if job is enabled).

amountHarvested

string^[0-9]*$

Assigned by API. Number of records harvested in last run. It seems this should really be an integer, but string is what the WSAPI gives us.

message

string

Assigned by API. Message summarising results of last run.

type

required

string

Value: "oaiPmh"

Indicates OAI-PMH job.

metadataPrefix

required

string

OAI-PMH only. The metadata prefix supported by the OAI-PMH service to harvest from.

oaiSetName

required

string

OAI-PMH only. The name of a record set offered by the OAI-PMH service to harvest from.

resumptionToken

string

OAI-PMH only. PMH identifier for fetching the next batch of records.

clearRtOnError

string

Default: "false"

Enum: "true" "false"

Clear the resumption token for harvests that complete in an error state. This is useful when the server errors out and the last resumption token is no longer valid.

fromDate

string

yyyy-MM-dd. If empty and no resumption token is set, the Harvester will harvest the full data set from the resource. When this field contains a value, upon completion of the job the Harvester will reset the value of this field to the day prior to the current run date, so subsequent runs will harvest only new records.
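
The reset described above amounts to simple date arithmetic, sketched below. `next_from_date` is a hypothetical helper that only mirrors the description (the day prior to the run date, formatted as yyyy-MM-dd); it is not Harvester's implementation.

```python
from datetime import date, timedelta


def next_from_date(run_date: date) -> str:
    """Return the fromDate value expected after a run on run_date."""
    day_before = run_date - timedelta(days=1)
    # %Y-%m-%d is the Python equivalent of the yyyy-MM-dd pattern.
    return day_before.strftime("%Y-%m-%d")
```

A run on 1 March 2024, for example, would leave the field set to 2024-02-29.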

untilDate

string

yyyy-MM-dd. Upper date limit for selective harvesting. On consecutive runs the Harvester will clear this field making the date interval open-ended.

retryCount

string^[0-9]*$

Default: "2"

Indicates how many times Harvester should retry a failed OAI-PMH request.

retryWait

string^[0-9]*$

Default: "60"

Indicates how many seconds Harvester should wait before retrying a failed OAI-PMH request.
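
Taken together, retryCount and retryWait describe a fixed-delay retry loop. The sketch below illustrates that pattern under stated assumptions: `fetch_with_retry` is a hypothetical helper, retryCount is read as the number of additional attempts after the first, and `OSError` stands in for whatever failure an OAI-PMH request may raise.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(fetch: Callable[[], T],
                     retry_count: int = 2,
                     retry_wait: float = 60,
                     sleep: Callable[[float], object] = time.sleep) -> T:
    """Call fetch(), retrying up to retry_count times with a fixed wait."""
    attempts = max(retry_count, 0) + 1  # first try plus the retries
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # retries exhausted: surface the last error
            sleep(retry_wait)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the wait can be replaced in tests; by default the function pauses for retry_wait seconds between attempts.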

allowErrors

string

Default: "false"

Enum: "true" "false"

Not applicable to OAI-PMH jobs.

json

object

Custom configurations in JSON format (has no current applications).

overwrite

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. Will delete all previously harvested data before beginning the next scheduled (or manually triggered) run, if set to true.

keepPartial

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. When true, partial records harvested during a failed harvest run will be retained in Solr.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"id": "string"
},
"transformation": {"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"storageBatchLimit": "string",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"lastHarvestFinished": "string",
"initiallyHarvested": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"type": "oaiPmh",
"metadataPrefix": "string",
"oaiSetName": "string",
"resumptionToken": "string",
"clearRtOnError": "true",
"fromDate": "string",
"untilDate": "string",
"retryCount": "2",
"retryWait": "60",
"allowErrors": "true",
"json": { },
"overwrite": "true",
"keepPartial": "true"
}

Response samples

201
400

Content type

application/json

{"type": "xmlBulk",
"allowErrors": "true",
"overwrite": "true",
"allowCondReq": "true",
"fromDate": "string",
"csvConfiguration": "string",
"excludeFilePattern": "string",
"expectedSchema": "string",
"includeFilePattern": "string",
"outputSchema": "string",
"passiveMode": "true",
"recurse": "true",
"splitAt": "string",
"splitSize": "string",
"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"entityType": "inventoryStorageEntity",
"bulkSize": "string",
"currentStatus": "string",
"customClass": "string",
"enabled": "true",
"idAsString": "string",
"name": "string",
"url": "string"
},
"transformation": {"entityType": "basicTransformation",
"acl": "string",
"description": "string",
"enabled": "true",
"name": "string",
"parallel": "true",
"stepAssociations": [{"id": "string",
"position": "string",
"step": {"entityType": "xmlTransformationStep",
"acl": "string",
"description": "string",
"inputFormat": "string",
"name": "string",
"outputFormat": "string",
"script": "<'script' omitted from nested displays>",
"id": "string",
"testData": "<'testData' omitted from nested displays>",
"testOutput": "<'testOutput' omitted from nested displays>"
},
"transformation": "string"
}
],
"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"currentStatus": "NEW",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"acl": "string"
}

getHarvestable

Get harvest configuration

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL
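
The three header parameters recur on every operation, so a client might collect them in one place. The following Python sketch builds such a header dictionary. The function name `okapi_headers` and the example tenant and token values are hypothetical. The schema does not mark any of the headers as required, so unset values are simply left out here.

```python
def okapi_headers(tenant, token=None, okapi_url=None):
    """Return the Okapi header parameters as a dict, omitting unset values."""
    candidates = {
        "X-Okapi-Tenant": tenant,
        "X-Okapi-Token": token,
        "X-Okapi-Url": okapi_url,
    }
    return {key: value for key, value in candidates.items() if value is not None}


# Illustrative values only.
headers = okapi_headers("example_tenant", token="example-token")
```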

Responses

Response samples

200
400

Content type

application/json

{"type": "xmlBulk",
"allowErrors": "true",
"overwrite": "true",
"allowCondReq": "true",
"fromDate": "string",
"csvConfiguration": "string",
"excludeFilePattern": "string",
"expectedSchema": "string",
"includeFilePattern": "string",
"outputSchema": "string",
"passiveMode": "true",
"recurse": "true",
"splitAt": "string",
"splitSize": "string",
"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"entityType": "inventoryStorageEntity",
"bulkSize": "string",
"currentStatus": "string",
"customClass": "string",
"enabled": "true",
"idAsString": "string",
"name": "string",
"url": "string"
},
"transformation": {"entityType": "basicTransformation",
"acl": "string",
"description": "string",
"enabled": "true",
"name": "string",
"parallel": "true",
"stepAssociations": [{"id": "string",
"position": "string",
"step": {"entityType": "xmlTransformationStep",
"acl": "string",
"description": "string",
"inputFormat": "string",
"name": "string",
"outputFormat": "string",
"script": "<'script' omitted from nested displays>",
"id": "string",
"testData": "<'testData' omitted from nested displays>",
"testOutput": "<'testOutput' omitted from nested displays>"
},
"transformation": "string"
}
],
"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"currentStatus": "NEW",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"acl": "string"
}

putHarvestable

Update harvest configuration

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

One of

id

string^[0-9]*$

Unique, numeric ID for the job definition. Will be assigned if not provided.

name

required

string

The name assigned to the harvest configuration.

description

string

Free form description of the configuration to support the administration.

openAccess

string

Enum: "true" "false"

tbd

storage

required

object or object

Reference to the storage configuration to use.

Any of

id

required

string

Reference to the ID of the storage engine to use.

transformation

required

object or object

Reference to the transformation pipeline to use.

Any of

id

required

string^[0-9]*$

Reference to the ID of the transformation pipeline to apply.

enabled

required

string

Enum: "true" "false"

Indicates whether the job is scheduled to run.

harvestImmediately

required

string

Enum: "true" "false"

Whether to harvest when the config is persisted.

scheduleString

string

Crontab style schedule string (simplified): minute(0-59) hour(0-24) day of month(* or 1-31) month (* or 1-12) day of week (* or 0-6).
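
A client-side sanity check of a schedule string could look like the Python sketch below. It is an assumption-laden reading of the description above: five whitespace-separated fields, each either a single number in the documented range or `*` where the description lists it. The function name `is_valid_schedule` is hypothetical, and the Harvester itself may accept more (for example, `*` in the minute or hour field).

```python
import re

# (field name, "*" allowed, lowest value, highest value), in field order.
# Ranges are copied from the description, including the hour upper bound.
SCHEDULE_FIELDS = [
    ("minute", False, 0, 59),
    ("hour", False, 0, 24),
    ("day of month", True, 1, 31),
    ("month", True, 1, 12),
    ("day of week", True, 0, 6),
]


def is_valid_schedule(schedule):
    """Return True if the string matches the simplified crontab layout."""
    parts = schedule.split()
    if len(parts) != len(SCHEDULE_FIELDS):
        return False
    for part, (_name, allows_star, low, high) in zip(parts, SCHEDULE_FIELDS):
        if part == "*":
            if not allows_star:
                return False
        elif not re.fullmatch(r"[0-9]+", part) or not low <= int(part) <= high:
            return False
    return True
```

Under these assumptions, `is_valid_schedule("30 2 * * *")` accepts a daily run at 02:30, while `is_valid_schedule("60 2 * * *")` is rejected because the minute is out of range.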

dateFormat

string

For example yyyy-MM-dd'T'hh:mm:ss'Z'.

url

required

string

The URL to access the data from.

timeout

string^[0-9]*$

Default: "300"

Connection/read timeout in seconds; how it is applied depends on the specific protocol used for fetching data.

cacheEnabled

string

Enum: "true" "false"

Whether or not to store incoming records in Harvester's file system.

diskRun

string

Enum: "true" "false"

Whether or not to run harvest job from records cached in a previous job run.

storageBatchLimit

string^[0-9]*$

Batch size: Number of records to send to storage at a time.

recordLimit

string^[0-9]*$

Maximum number of records to harvest.

laxParsing

string

Default: "false"

Enum: "true" "false"

When enabled, Harvester will attempt to parse malformed XML (missing closing tags, entities)

constantFields

string

Values related to target handling in MasterKey. Otherwise obsolete.

storeOriginal

string

Default: "false"

Enum: "true" "false"

Indicates whether to store incoming original record, if supported by the job type and the storage configuration.

managedBy

string

Free-text field for tagging a job with the producer or manager of the resource. Multiple tags may be separated by commas. The tags can be used for filtering status reports by job administrators for example.

usedBy

string

Free form administrative information; could be tags for the clients using this harvestable.

serviceProvider

string

Free-text field for administrative information about the harvest job.

contactNotes

string

Free form text field for administrator's notes.

technicalNotes

string

Free-text field for administrative information.

logLevel

string

Enum: "ERROR" "WARN" "INFO" "DEBUG" "TRACE"

Specifies the logging level for the job with TRACE being the most (extremely) verbose. INFO is the recommended log level in most cases.

failedRecordsLogging

string

Default: "CLEAN_DIRECTORY"

Enum: "NO_STORE" "CLEAN_DIRECTORY" "CREATE_OVERWRITE" "ADD_ALL"

Specify whether or not failed records should be saved as XML files in a designated log directory. Also specifies retention policy for the directory, that is, whether to retain files that were saved in previous runs (CLEAN_DIRECTORY = don't retain) and, if so, whether to overwrite any existing files if the same record fails again (CREATE_OVERWRITE) or rather add a sequence number to the new file name in order not to overwrite (ADD_ALL).

maxSavedFailedRecordsPerRun

string^[0-9]*$

Default: "100"

Sets a maximum number of files to save in the failed records directory per run. The job log will tell when the limit is reached.

maxSavedFailedRecordsTotal

string^[0-9]*$

Default: "1000"

Sets a maximum number of files to be saved in the failed records directory at any given time, counted as the sum of previously saved records (that were not cleaned up before this run) plus any new records added during the run. The job log will tell when the limit is reached.
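
The interplay of the per-run and total limits can be sketched as simple arithmetic. The Python function below assumes that the number of files a run may still save is the smaller of the per-run limit and the room left under the total limit. The source does not spell out the exact enforcement, so this is an interpretation, and the function name `files_allowed_this_run` is hypothetical.

```python
def files_allowed_this_run(already_saved, per_run_limit=100, total_limit=1000):
    """Estimate how many failed-record files a run may still save.

    already_saved: files left in the directory from earlier runs.
    Defaults mirror the documented defaults ("100" and "1000").
    """
    remaining_total = max(total_limit - already_saved, 0)
    return min(per_run_limit, remaining_total)
```

With the defaults, a directory that already holds 950 files would leave room for 50 more, even though the per-run limit alone would allow 100.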

mailAddress

string

Comma separated list of e-mail addresses that should receive notification on job completion.

mailLevel

string

Enum: "OK" "WARN" "ERROR"

The minimum severity of a job's completion status that will trigger email notification.
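
The threshold behavior can be illustrated with a small Python sketch. It assumes the severities are ordered OK < WARN < ERROR and that a job's completion status uses the same three labels. Neither the ordering table nor the function name `should_notify` comes from the API.

```python
# Assumed ordering of the enum values, least to most severe.
SEVERITY = {"OK": 0, "WARN": 1, "ERROR": 2}


def should_notify(completion_status, mail_level):
    """Return True if the status is at least as severe as mailLevel."""
    return SEVERITY[completion_status] >= SEVERITY[mail_level]
```

Under this reading, a mailLevel of "OK" would send mail for every completed job, while "ERROR" would send mail only for failed ones.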

initiallyHarvested

string

Date and time, assigned by Harvester

lastHarvestFinished

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration completed.

lastHarvestStarted

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration began.

lastUpdated

string

Assigned by API. The date and time when this definition was last modified.

nextHarvestSchedule

string

The date and time when a job with this definition should be run (if job is enabled).

amountHarvested

string^[0-9]*$

Assigned by API. Number of records harvested in last run. It seems this should really be an integer, but string is what the WSAPI gives us.

message

string

Assigned by API. Message summarising results of last run.

type

required

string

Value: "xmlBulk"

Indicates bulk XML job.

retryCount

string^[0-9]*$

Default: "2"

Obsolete but allowed for XML bulk.

retryWait

string^[0-9]*$

Default: "60"

Obsolete but allowed for XML bulk.

allowErrors

string

Default: "false"

Enum: "true" "false"

Whether or not to continue despite harvest record errors.

allowCondReq

string

Default: "false"

Enum: "true" "false"

Whether or not to filter on file date to only harvest new XML files

fromDate

string

Initial start date (yyyy-MM-dd) for incremental updates (when allowCondReq is 'true')

csvConfiguration

string

Semicolon-separated key-value pairs that specify how a CSV file is parsed into XML for further processing (see Harvester documentation for details).

excludeFilePattern

string

Regular expression. Files whose names match it are skipped during harvesting (see Harvester documentation for details).

expectedSchema

string

MIME type override (e.g., application/marc; charset=MARC-8).

includeFilePattern

string

Regular expression. Files whose names match it are harvested, unless they are also excluded by excludeFilePattern. .zip, .gz, and .tar files are included by default unless explicitly excluded by excludeFilePattern (see Harvester documentation for details).

outputSchema

string

MARC XML transformation format (application/marc or application/tmarc).

passiveMode

string

Default: "false"

Enum: "true" "false"

Whether or not to use passive mode for FTP transfers.

recurse

string

Default: "false"

Enum: "true" "false"

Whether or not to recurse into sub-folders in the source directory tree.

splitAt

string^[0-9]*$

Level/depth to split XML files at to extract records. Zero/empty disables split.

splitSize

string^[0-9]*$

Splits large XML files into chunks of splitSize records each, to conserve memory during XSLT transformations.

json

object

Custom configurations in JSON format (has no current applications).

overwrite

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. Will delete all previously harvested data before beginning the next scheduled (or manually triggered) run, if set to true.

keepPartial

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. When true, partial records harvested during a failed harvest run will be retained in Solr.

Responses

Request samples

Payload

Content type

application/json

Example

harvestableXmlBulkPostPut

{"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"id": "string"
},
"transformation": {"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"storageBatchLimit": "string",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"initiallyHarvested": "string",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"type": "xmlBulk",
"retryCount": "2",
"retryWait": "60",
"allowErrors": "true",
"allowCondReq": "true",
"fromDate": "string",
"csvConfiguration": "string",
"excludeFilePattern": "string",
"expectedSchema": "string",
"includeFilePattern": "string",
"outputSchema": "string",
"passiveMode": "true",
"recurse": "true",
"splitAt": "string",
"splitSize": "string",
"json": { },
"overwrite": "true",
"keepPartial": "true"
}

Response samples

400

Content type

text/plain

No sample

deleteHarvestable

Delete a harvest job configuration

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

400

Content type

text/plain

No sample

putHarvestableXmlBulk

Update bulk XML harvest configuration

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

id

string^[0-9]*$

Unique, numeric ID for the job definition. Will be assigned if not provided.

name

required

string

The name assigned to the harvest configuration.

description

string

Free form description of the configuration to support the administration.

openAccess

string

Enum: "true" "false"

tbd

storage

required

object or object

Reference to the storage configuration to use.

Any of

id

required

string

Reference to the ID of the storage engine to use.

transformation

required

object or object

Reference to the transformation pipeline to use.

Any of

id

required

string^[0-9]*$

Reference to the ID of the transformation pipeline to apply.

enabled

required

string

Enum: "true" "false"

Indicates whether the job is scheduled to run.

harvestImmediately

required

string

Enum: "true" "false"

Whether to harvest when the config is persisted.

scheduleString

string

Crontab style schedule string (simplified): minute(0-59) hour(0-24) day of month(* or 1-31) month (* or 1-12) day of week (* or 0-6).

dateFormat

string

For example yyyy-MM-dd'T'hh:mm:ss'Z'.

url

required

string

The URL to access the data from.

timeout

string^[0-9]*$

Default: "300"

Connection/read timeout in seconds; how it is applied depends on the specific protocol used for fetching data.

cacheEnabled

string

Enum: "true" "false"

Whether or not to store incoming records in Harvester's file system.

diskRun

string

Enum: "true" "false"

Whether or not to run harvest job from records cached in a previous job run.

storageBatchLimit

string^[0-9]*$

Batch size: Number of records to send to storage at a time.

recordLimit

string^[0-9]*$

Maximum number of records to harvest.

laxParsing

string

Default: "false"

Enum: "true" "false"

When enabled, Harvester will attempt to parse malformed XML (missing closing tags, entities)

constantFields

string

Values related to target handling in MasterKey. Otherwise obsolete.

storeOriginal

string

Default: "false"

Enum: "true" "false"

Indicates whether to store incoming original record, if supported by the job type and the storage configuration.

managedBy

string

Free-text field for tagging a job with the producer or manager of the resource. Multiple tags may be separated by commas. The tags can be used for filtering status reports by job administrators for example.

usedBy

string

Free form administrative information; could be tags for the clients using this harvestable.

serviceProvider

string

Free-text field for administrative information about the harvest job.

contactNotes

string

Free form text field for administrator's notes.

technicalNotes

string

Free-text field for administrative information.

logLevel

string

Enum: "ERROR" "WARN" "INFO" "DEBUG" "TRACE"

Specifies the logging level for the job with TRACE being the most (extremely) verbose. INFO is the recommended log level in most cases.

failedRecordsLogging

string

Default: "CLEAN_DIRECTORY"

Enum: "NO_STORE" "CLEAN_DIRECTORY" "CREATE_OVERWRITE" "ADD_ALL"

Specify whether or not failed records should be saved as XML files in a designated log directory. Also specifies retention policy for the directory, that is, whether to retain files that were saved in previous runs (CLEAN_DIRECTORY = don't retain) and, if so, whether to overwrite any existing files if the same record fails again (CREATE_OVERWRITE) or rather add a sequence number to the new file name in order not to overwrite (ADD_ALL).

maxSavedFailedRecordsPerRun

string^[0-9]*$

Default: "100"

Sets a maximum number of files to save in the failed records directory per run. The job log will tell when the limit is reached.

maxSavedFailedRecordsTotal

string^[0-9]*$

Default: "1000"

Sets a maximum number of files to be saved in the failed records directory at any given time, counted as the sum of previously saved records (that were not cleaned up before this run) plus any new records added during the run. The job log will tell when the limit is reached.

mailAddress

string

Comma separated list of e-mail addresses that should receive notification on job completion.

mailLevel

string

Enum: "OK" "WARN" "ERROR"

The minimum severity of a job's completion status that will trigger email notification.

initiallyHarvested

string

Date and time, assigned by Harvester

lastHarvestFinished

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration completed.

lastHarvestStarted

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration began.

lastUpdated

string

Assigned by API. The date and time when this definition was last modified.

nextHarvestSchedule

string

The date and time when a job with this definition should be run (if job is enabled).

amountHarvested

string^[0-9]*$

Assigned by API. Number of records harvested in last run. It seems this should really be an integer, but string is what the WSAPI gives us.

message

string

Assigned by API. Message summarising results of last run.

type

required

string

Value: "xmlBulk"

Indicates bulk XML job.

retryCount

string^[0-9]*$

Default: "2"

Obsolete but allowed for XML bulk.

retryWait

string^[0-9]*$

Default: "60"

Obsolete but allowed for XML bulk.

allowErrors

string

Default: "false"

Enum: "true" "false"

Whether or not to continue despite harvest record errors.

allowCondReq

string

Default: "false"

Enum: "true" "false"

Whether or not to filter on file date to only harvest new XML files

fromDate

string

Initial start date (yyyy-MM-dd) for incremental updates (when allowCondReq is 'true')

csvConfiguration

string

Semicolon-separated key-value pairs that specify how a CSV file is parsed into XML for further processing (see Harvester documentation for details).

excludeFilePattern

string

Regular expression. Files whose names match it are skipped during harvesting (see Harvester documentation for details).

expectedSchema

string

MIME type override (e.g., application/marc; charset=MARC-8).

includeFilePattern

string

Regular expression. Files whose names match it are harvested, unless they are also excluded by excludeFilePattern. .zip, .gz, and .tar files are included by default unless explicitly excluded by excludeFilePattern (see Harvester documentation for details).
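
How the include and exclude patterns might combine is sketched below in Python. Several points are assumptions rather than documented behavior: the patterns are applied with Python's `re.search` (partial match), whereas the Harvester's own regex dialect and matching mode may differ; archive files are recognized by file-name suffix; and with no include pattern every non-excluded file is accepted. The function name `should_harvest` is hypothetical.

```python
import re

# Suffixes described as included by default.
DEFAULT_ARCHIVE_SUFFIXES = (".zip", ".gz", ".tar")


def should_harvest(file_name, include_pattern=None, exclude_pattern=None):
    """Decide whether a file name passes the include/exclude filters."""
    # Exclusion wins over everything else, including the archive defaults.
    if exclude_pattern and re.search(exclude_pattern, file_name):
        return False
    # Archives are included by default unless explicitly excluded.
    if file_name.endswith(DEFAULT_ARCHIVE_SUFFIXES):
        return True
    if include_pattern:
        return re.search(include_pattern, file_name) is not None
    # Assumption: without an include pattern, nothing else is filtered out.
    return True
```

For example, with an include pattern of `\.xml$`, a file named `records.xml` passes, `notes.txt` does not, and `dump.zip` passes unless an exclude pattern such as `\.zip$` is also given.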

outputSchema

string

MARC XML transformation format (application/marc or application/tmarc).

passiveMode

string

Default: "false"

Enum: "true" "false"

Whether or not to use passive mode for FTP transfers.

recurse

string

Default: "false"

Enum: "true" "false"

Whether or not to recurse into sub-folders in the source directory tree.

splitAt

string^[0-9]*$

Level/depth to split XML files at to extract records. Zero/empty disables split.

splitSize

string^[0-9]*$

Splits large XML files into chunks of splitSize records each, to conserve memory during XSLT transformations.
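
The chunking idea can be shown with a short Python sketch. It operates on an in-memory list purely for illustration. The Harvester works on XML files, and its actual splitting logic is not described here. The function name `chunk_records` is hypothetical.

```python
def chunk_records(records, split_size):
    """Yield consecutive lists of at most split_size records."""
    if split_size <= 0:
        raise ValueError("split_size must be a positive integer")
    for start in range(0, len(records), split_size):
        yield records[start:start + split_size]


chunks = list(chunk_records(["r1", "r2", "r3", "r4", "r5"], 2))
# chunks == [["r1", "r2"], ["r3", "r4"], ["r5"]]
```

Each chunk can then be transformed on its own, so only split_size records need to be held in memory at a time.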

json

object

Custom configurations in JSON format (has no current applications).

overwrite

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. Will delete all previously harvested data before beginning the next scheduled (or manually triggered) run, if set to true.

keepPartial

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. When true, partial records harvested during a failed harvest run will be retained in Solr.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"id": "string"
},
"transformation": {"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"storageBatchLimit": "string",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"initiallyHarvested": "string",
"lastHarvestFinished": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"type": "xmlBulk",
"retryCount": "2",
"retryWait": "60",
"allowErrors": "true",
"allowCondReq": "true",
"fromDate": "string",
"csvConfiguration": "string",
"excludeFilePattern": "string",
"expectedSchema": "string",
"includeFilePattern": "string",
"outputSchema": "string",
"passiveMode": "true",
"recurse": "true",
"splitAt": "string",
"splitSize": "string",
"json": { },
"overwrite": "true",
"keepPartial": "true"
}
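
Several fields in this schema are strings constrained by the pattern `^[0-9]*$`. The placeholder value "string" shown in the sample above would therefore not pass validation for fields such as storageBatchLimit or splitSize. The Python sketch below checks a payload against that pattern before it is sent. The function name `invalid_numeric_fields` is hypothetical, and the field list covers only top-level fields from the bulk XML schema.

```python
import re

# Equivalent of the schema pattern ^[0-9]*$; note that it also
# accepts the empty string.
NUMERIC_STRING = re.compile(r"[0-9]*")

NUMERIC_FIELDS = (
    "id", "timeout", "storageBatchLimit", "recordLimit",
    "maxSavedFailedRecordsPerRun", "maxSavedFailedRecordsTotal",
    "amountHarvested", "retryCount", "retryWait", "splitAt", "splitSize",
)


def invalid_numeric_fields(payload):
    """Return the names of numeric-string fields with unacceptable values."""
    bad = []
    for field in NUMERIC_FIELDS:
        value = payload.get(field)
        if value is None:
            continue  # field not supplied
        # Values must be strings of digits, not JSON numbers.
        if not isinstance(value, str) or not NUMERIC_STRING.fullmatch(value):
            bad.append(field)
    return bad
```

For instance, a payload with `"timeout": "300"`, `"retryCount": 2` and `"splitSize": "1k"` would report retryCount (a number instead of a string) and splitSize (non-digit characters).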

Response samples

400

Content type

text/plain

No sample

putHarvestableOaiPmh

Update OAI-PMH harvest configuration

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

id

string^[0-9]*$

Unique, numeric ID for the job definition. Will be assigned if not provided.

name

required

string

The name assigned to the harvest configuration.

description

string

Free form description of the configuration to support the administration.

openAccess

string

Enum: "true" "false"

tbd

storage

required

object or object

Reference to the storage configuration to use.

Any of

id

required

string

Reference to the ID of the storage engine to use.

transformation

required

object or object

Reference to the transformation pipeline to use.

Any of

id

required

string^[0-9]*$

Reference to the ID of the transformation pipeline to apply.

enabled

required

string

Enum: "true" "false"

Indicates whether the job is scheduled to run.

harvestImmediately

required

string

Enum: "true" "false"

Whether to harvest when the config is persisted.

scheduleString

string

Crontab style schedule string (simplified): minute(0-59) hour(0-24) day of month(* or 1-31) month (* or 1-12) day of week (* or 0-6).

dateFormat

string

For example yyyy-MM-dd'T'hh:mm:ss'Z'.

url

required

string

The URL to access the data from.

timeout

string^[0-9]*$

Default: "300"

Connection/read timeout in seconds; how it is applied depends on the specific protocol used for fetching data.

cacheEnabled

string

Enum: "true" "false"

Whether or not to store incoming records in Harvester's file system.

diskRun

string

Enum: "true" "false"

Whether or not to run harvest job from records cached in a previous job run.

storageBatchLimit

string^[0-9]*$

Batch size: Number of records to send to storage at a time.

recordLimit

string^[0-9]*$

Maximum number of records to harvest.

laxParsing

string

Default: "false"

Enum: "true" "false"

When enabled, Harvester will attempt to parse malformed XML (missing closing tags, entities)

constantFields

string

Values related to target handling in MasterKey. Otherwise obsolete.

storeOriginal

string

Default: "false"

Enum: "true" "false"

Indicates whether to store incoming original record, if supported by the job type and the storage configuration.

managedBy

string

Free-text field for tagging a job with the producer or manager of the resource. Multiple tags may be separated by commas. The tags can be used for filtering status reports by job administrators for example.

usedBy

string

Free form administrative information; could be tags for the clients using this harvestable.

serviceProvider

string

Free-text field for administrative information about the harvest job.

contactNotes

string

Free form text field for administrator's notes.

technicalNotes

string

Free-text field for administrative information.

logLevel

string

Enum: "ERROR" "WARN" "INFO" "DEBUG" "TRACE"

Specifies the logging level for the job with TRACE being the most (extremely) verbose. INFO is the recommended log level in most cases.

failedRecordsLogging

string

Default: "CLEAN_DIRECTORY"

Enum: "NO_STORE" "CLEAN_DIRECTORY" "CREATE_OVERWRITE" "ADD_ALL"

Specify whether or not failed records should be saved as XML files in a designated log directory. Also specifies retention policy for the directory, that is, whether to retain files that were saved in previous runs (CLEAN_DIRECTORY = don't retain) and, if so, whether to overwrite any existing files if the same record fails again (CREATE_OVERWRITE) or rather add a sequence number to the new file name in order not to overwrite (ADD_ALL).

maxSavedFailedRecordsPerRun

string^[0-9]*$

Default: "100"

Sets a maximum number of files to save in the failed records directory per run. The job log will tell when the limit is reached.

maxSavedFailedRecordsTotal

string^[0-9]*$

Default: "1000"

Sets a maximum number of files to be saved in the failed records directory at any given time, counted as the sum of previously saved records (that were not cleaned up before this run) plus any new records added during the run. The job log will tell when the limit is reached.

mailAddress

string

Comma separated list of e-mail addresses that should receive notification on job completion.

mailLevel

string

Enum: "OK" "WARN" "ERROR"

The minimum severity of a job's completion status that will trigger email notification.

lastHarvestFinished

string

Assigned by API. The date and time when the most recent harvest job with this configuration completed.

initiallyHarvested

string

Date and time, assigned by Harvester

lastHarvestStarted

string

Assigned by Harvester. The date and time when the most recent harvest job with this configuration began.

lastUpdated

string

Assigned by Harvester. The date and time when this definition was last modified.

nextHarvestSchedule

string

The date and time when a job with this definition should be run (if job is enabled).

amountHarvested

string^[0-9]*$

Assigned by API. Number of records harvested in last run. It seems this should really be an integer, but string is what the WSAPI gives us.

message

string

Assigned by API. Message summarising results of last run.

type

required

string

Value: "oaiPmh"

Indicates OAI-PMH job.

metadataPrefix

required

string

OAI-PMH only. The metadata prefix supported by the OAI-PMH service to harvest from.

oaiSetName

required

string

OAI-PMH only. The name of a record set offered by the OAI-PMH service to harvest from.

resumptionToken

string

OAI-PMH only. PMH identifier for fetching the next batch of records.

clearRtOnError

string

Default: "false"

Enum: "true" "false"

Clear the resumption token for harvests that complete in an error state. This is useful when the server errors out and the last resumption token is no longer valid.

fromDate

string

yyyy-MM-dd. If empty and no resumption token is set, the Harvester will harvest the full data set from the resource. When this field contains a value, upon completion of the job the Harvester will reset the value of this field to the day prior to the current run date, so subsequent runs will harvest only new records.
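
The date reset described above amounts to subtracting one day from the run date. A minimal Python sketch, assuming the run date is taken as a plain calendar date (time zone handling is not specified in the description) and using the hypothetical function name `next_from_date`:

```python
from datetime import date, timedelta


def next_from_date(run_date):
    """Return the fromDate value expected after a run on run_date."""
    return (run_date - timedelta(days=1)).strftime("%Y-%m-%d")


value = next_from_date(date(2024, 3, 1))
# value == "2024-02-29"
```

Setting the date back by one day means the following run overlaps the previous one slightly, which presumably guards against missing records added late on the run date.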

untilDate

string

yyyy-MM-dd. Upper date limit for selective harvesting. On subsequent runs, the Harvester clears this field, making the date interval open-ended.

retryCount

string^[0-9]*$

Default: "2"

Indicates how many times Harvester should retry a failed OAI-PMH request.

retryWait

string^[0-9]*$

Default: "60"

Indicates how many seconds Harvester should wait before retrying a failed OAI-PMH request.
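
Taken together, retryCount and retryWait describe a fixed-delay retry loop. The Python sketch below shows one plausible reading: retryCount counts retries after the initial attempt, and the wait between attempts is constant. Both points are assumptions, as is the function name `fetch_with_retries`. The Harvester's internal behavior may differ.

```python
import time


def fetch_with_retries(fetch, retry_count=2, retry_wait=60, sleep=time.sleep):
    """Call fetch(), retrying up to retry_count times after a failure.

    fetch: zero-argument callable performing the request (hypothetical).
    sleep: injectable so the waiting can be replaced in tests.
    """
    retries_used = 0
    while True:
        try:
            return fetch()
        except Exception:  # broad on purpose; a real client would narrow this
            if retries_used >= retry_count:
                raise
            retries_used += 1
            sleep(retry_wait)
```

With the defaults of "2" and "60", this reading gives at most three requests and at most 120 seconds of waiting before the failure is reported.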

allowErrors

string

Default: "false"

Enum: "true" "false"

Not applicable to OAI-PMH.

json

object

Custom configurations in JSON format (has no current applications).

overwrite

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. Will delete all previously harvested data before beginning the next scheduled (or manually triggered) run, if set to true.

keepPartial

string

Enum: "true" "false"

Applies to Solr but not FOLIO Inventory. When true, partial records harvested during a failed harvest run will be retained in Solr.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"name": "string",
"description": "string",
"openAccess": "true",
"storage": {"id": "string"
},
"transformation": {"id": "string"
},
"enabled": "true",
"harvestImmediately": "true",
"scheduleString": "string",
"dateFormat": "string",
"url": "string",
"timeout": "300",
"cacheEnabled": "true",
"diskRun": "true",
"storageBatchLimit": "string",
"recordLimit": "string",
"laxParsing": "true",
"constantFields": "string",
"storeOriginal": "true",
"managedBy": "string",
"usedBy": "string",
"serviceProvider": "string",
"contactNotes": "string",
"technicalNotes": "string",
"logLevel": "ERROR",
"failedRecordsLogging": "NO_STORE",
"maxSavedFailedRecordsPerRun": "100",
"maxSavedFailedRecordsTotal": "1000",
"mailAddress": "string",
"mailLevel": "OK",
"lastHarvestFinished": "string",
"initiallyHarvested": "string",
"lastHarvestStarted": "string",
"lastUpdated": "string",
"nextHarvestSchedule": "string",
"amountHarvested": "string",
"message": "string",
"type": "oaiPmh",
"metadataPrefix": "string",
"oaiSetName": "string",
"resumptionToken": "string",
"clearRtOnError": "true",
"fromDate": "string",
"untilDate": "string",
"retryCount": "2",
"retryWait": "60",
"allowErrors": "true",
"json": { },
"overwrite": "true",
"keepPartial": "true"
}

Response samples

400

Content type

text/plain

No sample

startJob

Starts a harvest job immediately if possible

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"harvestableId": 0,
"name": "string",
"initiated": "string"
}

stopJob

Stops a harvest job

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

201
400

Content type

application/json

{ }

getJobLog

Get log statements for a harvest job

path Parameters

id

required

number

Harvest configuration identifier

query Parameters

offset	string log file start line
limit	string max log file lines

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

400

Content type

text/plain

No sample

getFailedRecords

Get failed records for a harvest job

path Parameters

id

required

number

Harvest configuration identifier

query Parameters

offset	string result set start row
limit	string result set max rows

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"failedRecords": [{"recordErrors": [{"error": {"label": "string",
"typeOfError": "string",
"typeOfRecord": "string",
"transaction": "string",
"message": { },
"entity": { }
}
}
],
"original": "string",
"transformedRecord": { },
"timeStamp": "string",
"recordNumber": "string",
"harvestableId": "string"
}
]
}
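
The `offset` and `limit` parameters suggest paging through the result set. A minimal sketch of such a loop follows. It assumes `offset` counts rows from 0 and that a page shorter than `limit` marks the end of the results; neither is stated in this reference. `fetch_page` is a hypothetical callable that performs the actual HTTP request and returns the parsed JSON body.

```python
from typing import Callable, Iterator


def iter_failed_records(fetch_page: Callable[[int, int], dict],
                        limit: int = 50) -> Iterator[dict]:
    """Yield every failed record, requesting one page at a time."""
    offset = 0
    while True:
        page = fetch_page(offset, limit).get("failedRecords", [])
        yield from page
        if len(page) < limit:  # short (or empty) page: nothing more to fetch
            break
        offset += limit


if __name__ == "__main__":
    # Stand-in for a real HTTP call, serving five fake records.
    data = [{"recordNumber": str(i)} for i in range(5)]

    def fake_fetch(offset: int, limit: int) -> dict:
        return {"failedRecords": data[offset:offset + limit]}

    for record in iter_failed_records(fake_fetch, limit=2):
        print(record["recordNumber"])
```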

getFailedRecord

Get a failed record for a harvest job

path Parameters

id required	number Harvest configuration identifier
num required	string number of a failed record

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"recordErrors": [{"error": {"label": "string",
"typeOfError": "string",
"typeOfRecord": "string",
"transaction": "string",
"message": { },
"entity": { }
}
}
],
"original": "string",
"transformedRecord": { },
"timeStamp": "string",
"recordNumber": "string",
"harvestableId": "string"
}

storeJobLogWithPostedStatus

Takes submitted job status, pulls the job config, and stores a copy of its most recent logs

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

status required	string Enum: "NEW" "OK" "WARN" "ERROR" "RUNNING" "FINISHED" "KILLED" The outcome of the harvester job according to Harvester.
finished	string ISO formatted timestamp for when the job finished.
started	string ISO formatted timestamp for when the job started.
amountHarvested	string The number of records harvested in the harvest run.
message required	string Status message for the outcome of the harvest run.

Responses

Request samples

Payload

Content type

application/json

{"status": "NEW",
"finished": "string",
"started": "string",
"amountHarvested": "string",
"message": "string"
}
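
A status payload like the one above could be assembled as follows. This is a sketch under assumptions: the schema only says the timestamps are "ISO formatted", so Python's `datetime.isoformat()` is used here and the exact variant the service accepts is not confirmed. The function name `build_job_status` is hypothetical.

```python
from datetime import datetime, timezone

# Enum values listed for the "status" property.
STATUSES = {"NEW", "OK", "WARN", "ERROR", "RUNNING", "FINISHED", "KILLED"}


def build_job_status(status, message, started=None, finished=None,
                     amount_harvested=None):
    """Build the request body; status and message are required."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    payload = {"status": status, "message": message}
    if started is not None:
        payload["started"] = started.isoformat()
    if finished is not None:
        payload["finished"] = finished.isoformat()
    if amount_harvested is not None:
        payload["amountHarvested"] = str(amount_harvested)  # schema types this as string
    return payload


if __name__ == "__main__":
    print(build_job_status(
        "OK", "Harvest completed",
        started=datetime(2024, 1, 5, 13, 0, tzinfo=timezone.utc),
        finished=datetime(2024, 1, 5, 13, 30, tzinfo=timezone.utc),
        amount_harvested=1200))
```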

Response samples

400

Content type

text/plain

No sample

storeJobLog

Pulls the current job config from Harvester and stores a copy of the most recent log for that job

path Parameters

id

required

number

Harvest configuration identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

400

Content type

text/plain

No sample

getPreviousJobs

Retrieves list of previous harvest jobs

query Parameters

query	string CQL query, supporting harvestableId, name, type, status, message, and amountHarvested in queries, and the same fields plus started and finished in sorting
offset	string result set start row
limit	string result set max rows
from	string date range start parameter on finished date
until	string date range end parameter on finished date

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"previousJobs": [{"id": "string",
"name": "string",
"harvestableId": 0,
"type": "string",
"url": "string",
"allowErrors": true,
"recordLimit": 0,
"transformation": "string",
"storage": "string",
"status": "string",
"started": "string",
"finished": "string",
"amountHarvested": 0,
"message": "string"
}
]
}
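
The query parameters for this operation can be encoded with the standard library. In the sketch below, the CQL expression and the date format are illustrative assumptions; this reference lists the supported fields but not the operators or the date syntax. The function name is hypothetical.

```python
from urllib.parse import urlencode


def previous_jobs_query(cql=None, offset=None, limit=None,
                        from_date=None, until_date=None) -> str:
    """Encode the documented query parameters, skipping unset ones."""
    params = {
        "query": cql,
        "offset": offset,
        "limit": limit,
        "from": from_date,
        "until": until_date,
    }
    return urlencode({k: v for k, v in params.items() if v is not None})


if __name__ == "__main__":
    # Assumed CQL: failed jobs, most recently finished first.
    print(previous_jobs_query(
        cql='status=="ERROR" sortBy finished/sort.descending',
        limit=10, from_date="2024-01-01"))
```

The returned string is appended to the request URL after a `?`.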

postPreviousJob

Creates job log samples without running a job, for example for testing or when importing logs from another FOLIO instance.

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

id	string Unique identifier (UUID) for this report of a single harvest run.
name	string The name of the harvest configuration at the time of logging the harvest run.
harvestableId	integer Unique identifier for the harvest configuration.
type	string The type of harvest job (bulk XML or OAI-PMH)
url	string The URL(s) used for retrieving the records that were harvested during this job.
allowErrors	boolean Indicates whether the job was configured to continue in case of (certain classes of) errors.
recordLimit	integer Indicates the limit -- if any -- on the maximum number of records to load according to the configuration.
transformation	string The name of the transformation pipeline that was used for the harvest job.
storage	string The name of the storage that was used for persisting the records harvested during the job.
status	string The outcome of the job. This would usually be the status after the job finished but it's possible to retrieve a history entry for a still running job.
started	string Timestamp indicating when the job began.
finished	string Timestamp indicating when the job completed.
amountHarvested	integer The number of (incoming) records that were processed.
message	string A description of the outcome of the harvest job, for example update statistics or a fatal error.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"name": "string",
"harvestableId": 0,
"type": "string",
"url": "string",
"allowErrors": true,
"recordLimit": 0,
"transformation": "string",
"storage": "string",
"status": "string",
"started": "string",
"finished": "string",
"amountHarvested": 0,
"message": "string"
}

Response samples

201
400

Content type

application/json

{"id": "string",
"name": "string",
"harvestableId": 0,
"type": "string",
"url": "string",
"allowErrors": true,
"recordLimit": 0,
"transformation": "string",
"storage": "string",
"status": "string",
"started": "string",
"finished": "string",
"amountHarvested": 0,
"message": "string"
}

getFailedRecordsForPreviousJobs

Retrieves the failed records of previous harvest jobs

query Parameters

query	string CQL query, supporting recordNumber, harvestableId, harvestableName in queries
offset	string result set start row
limit	string result set max rows
from	string date range parameter on error report timestamp
until	string date range parameter on error report timestamp

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"failedRecords": [{"id": "string",
"harvestJobId": "string",
"harvestableId": 0,
"harvestableName": "string",
"recordErrors": [{"error": {"label": "string",
"typeOfError": { },
"typeOfRecord": "string",
"transaction": "string",
"message": { },
"entity": { }
}
}
],
"original": "string",
"transformedRecord": { },
"timeStamp": "string",
"recordNumber": "string"
}
]
}
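
A response of this shape nests errors two levels deep (`failedRecords`, then `recordErrors`, then `error`). The following sketch tallies errors by their `label`, which can help spot the dominant failure in a batch. The function name is hypothetical, and missing keys are tolerated because this reference does not say which properties are always present.

```python
from collections import Counter


def count_error_labels(response: dict) -> Counter:
    """Count recordErrors entries per error label across all failed records."""
    counts = Counter()
    for record in response.get("failedRecords", []):
        for item in record.get("recordErrors", []):
            label = item.get("error", {}).get("label", "unknown")
            counts[label] += 1
    return counts


if __name__ == "__main__":
    sample = {"failedRecords": [
        {"recordNumber": "1", "recordErrors": [
            {"error": {"label": "Error A"}},
            {"error": {"label": "Error B"}}]},
        {"recordNumber": "2", "recordErrors": [
            {"error": {"label": "Error A"}}]},
    ]}
    for label, count in count_error_labels(sample).most_common():
        print(count, label)
```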

getPreviousJob

Retrieves details of a previous harvest job

path Parameters

id

required

string <uuid>

Harvest job identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"id": "string",
"name": "string",
"harvestableId": 0,
"type": "string",
"url": "string",
"allowErrors": true,
"recordLimit": 0,
"transformation": "string",
"storage": "string",
"status": "string",
"started": "string",
"finished": "string",
"amountHarvested": 0,
"message": "string"
}

deletePreviousJob

Delete a previous job run with all its logs

path Parameters

id

required

string <uuid>

Harvest job identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

400

Content type

text/plain

No sample

getPreviousJobLog

Retrieves the log of a previous harvest job

path Parameters

id

required

string <uuid>

Harvest job identifier

query Parameters

query

string

CQL, supporting harvestJobId, logLevel, jobLabel, line in query terms

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

text/plain

No sample

postPreviousJobLog

Backdoor for creating logs of a previous harvest job without running a job

path Parameters

id

required

string <uuid>

Harvest job identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: text/plain
required

string

Responses

Response samples

400

Content type

text/plain

No sample

getFailedRecordsForPreviousJob

Retrieves the failed records of a previous harvest job

path Parameters

id

required

string <uuid>

Harvest job identifier

query Parameters

query	string CQL query, supporting recordNumber, harvestableId, harvestableName in queries
from	string date range parameter on error report timestamp
until	string date range parameter on error report timestamp
offset	string result set start row
limit	string result set max rows

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"failedRecords": [{"id": "string",
"harvestJobId": "string",
"harvestableId": 0,
"harvestableName": "string",
"recordErrors": [{"error": {"label": "string",
"typeOfError": { },
"typeOfRecord": "string",
"transaction": "string",
"message": { },
"entity": { }
}
}
],
"original": "string",
"transformedRecord": { },
"timeStamp": "string",
"recordNumber": "string"
}
]
}

postFailedRecords

Create failed record samples without running a job, for example to import from another FOLIO instance.

path Parameters

id

required

string <uuid>

Harvest job identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

Array of objects (failedRecordPreviousJob)

List of failed records created by a previous harvest job run.

Array

id	string Unique ID (uuid) for this error report.
harvestJobId	string Identifier (uuid) of the harvest job run creating the error report
harvestableId	integer Identifier (numeric) of the harvest configuration (harvestable), by which the job was run.
harvestableName	string Name of the harvest configuration (harvestable), by which the job was run.
recordErrors	Array of objects List of errors encountered during upsert or delete.
original	string The original incoming record, i.e. the XML before transformation to an Inventory record set.
transformedRecord	object The JSON outcome of the transformation of the original record.
timeStamp	string The time the error occurred, Day Mon DD HH24:mi:ss TZ yyyy
recordNumber	string The identifier assigned to this error report by Harvester

Responses

Request samples

Payload

Content type

application/json

{"failedRecords": [{"id": "string",
"harvestJobId": "string",
"harvestableId": 0,
"harvestableName": "string",
"recordErrors": [{"error": {"label": "string",
"typeOfError": { },
"typeOfRecord": "string",
"transaction": "string",
"message": { },
"entity": { }
}
}
],
"original": "string",
"transformedRecord": { },
"timeStamp": "string",
"recordNumber": "string"
}
]
}
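
The `timeStamp` pattern given above (Day Mon DD HH24:mi:ss TZ yyyy) can be produced with `strftime`. The sketch below builds one failed-record entry and wraps it in a `failedRecords` array, following the request sample. The helper names and example values are hypothetical, and the assumption that the day of month is zero-padded is not confirmed by this reference.

```python
import uuid
from datetime import datetime, timezone


def harvester_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as 'Day Mon DD HH24:mi:ss TZ yyyy'."""
    return moment.strftime("%a %b %d %H:%M:%S %Z %Y")


def build_failed_records(harvest_job_id, harvestable_id, harvestable_name,
                         original_xml, record_number, moment, errors=()):
    """Wrap a single failed record in the payload shape shown in the sample."""
    record = {
        "id": str(uuid.uuid4()),
        "harvestJobId": harvest_job_id,
        "harvestableId": harvestable_id,
        "harvestableName": harvestable_name,
        "recordErrors": list(errors),
        "original": original_xml,
        "timeStamp": harvester_timestamp(moment),
        "recordNumber": record_number,
    }
    return {"failedRecords": [record]}


if __name__ == "__main__":
    when = datetime(2024, 1, 5, 13, 4, 9, tzinfo=timezone.utc)
    print(build_failed_records(str(uuid.uuid4()), 7, "Demo job",
                               "<record/>", "42", when))
```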

Response samples

400

Content type

text/plain

No sample

getFailedRecordForPreviousJob

Retrieves a failed record of a previous harvest job

path Parameters

id

required

string <uuid>

UUID of the failed-record object

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

400

Content type

text/plain

No sample

postStorage

Create storage configuration

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

One of

type	string Value: "inventoryStorage" Type of storage.
json required	object Storage configuration parameters in JSON.
id	string Unique storage identifier.
name	string Name of the storage definition.
description	string Free text details about the storage definition.
enabled	string Default: "false" Enum: "true" "false" Boolean string to indicate if the storage definition can be used.
url	string Address of the storage service.

Responses

Request samples

Payload

Content type

application/json

{"type": "solrStorage",
"id": "string",
"name": "string",
"description": "string",
"enabled": "true",
"url": "string"
}

Response samples

201
400

Content type

application/json

{"type": "solrStorage",
"acl": "string",
"id": "string",
"name": "string",
"description": "string",
"enabled": "true",
"bulkSize": "string",
"currentStatus": "string",
"url": "string"
}

getStorages

Get brief storage definitions

query Parameters

query

string

CQL

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"storages": [{"id": 0,
"name": "string",
"enabled": "true",
"description": "string"
}
],
"totalRecords": 0
}

getStorage

Get storage definition

path Parameters

id

required

integer

Storage definition identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"type": "solrStorage",
"acl": "string",
"id": "string",
"name": "string",
"description": "string",
"enabled": "true",
"bulkSize": "string",
"currentStatus": "string",
"url": "string"
}

putStorage

Update storage definition

path Parameters

id

required

integer

Storage definition identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

One of

type	string Value: "inventoryStorage" Type of storage.
json required	object Storage configuration parameters in JSON.
id	string Unique storage identifier.
name	string Name of the storage definition.
description	string Free text details about the storage definition.
enabled	string Default: "false" Enum: "true" "false" Boolean string to indicate if the storage definition can be used.
url	string Address of the storage service.

Responses

Request samples

Payload

Content type

application/json

{"type": "solrStorage",
"id": "string",
"name": "string",
"description": "string",
"enabled": "true",
"url": "string"
}

Response samples

400

Content type

text/plain

No sample

deleteStorage

Delete a storage definition

path Parameters

id

required

integer

Storage definition identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

postTransformation

Create transformation pipeline

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json
required

id

string

Unique record identifier.

name

required

string

Name of the transformation pipeline.

description

string

Details about the pipeline.

type

required

string

Enum: "basicTransformation" "customTransformation"

The type of transformation pipeline.

enabled

string

Enum: "true" "false"

Indicates if the transformation pipeline can be used by harvest jobs

parallel

string

Default: "false"

Enum: "true" "false"

Indicates if steps should be run concurrently (each in its own thread).

stepAssociations

Array of objects or objects or objects

List of steps that make up the transformation pipeline. In a POST this will be used for attaching the steps to the pipeline. In a PUT this is ignored.

Array

Any of

position	string The step's position in the sequence of steps, starting with number 1.
step required	object or object Existing transformation step to include in the pipeline, referenced by ID or step name.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"name": "string",
"description": "string",
"type": "basicTransformation",
"enabled": "true",
"parallel": "true",
"stepAssociations": [{"position": "string",
"step": {"id": "string"
}
}
]
}
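
Because step positions start at 1 and are sent as strings, the `stepAssociations` array can be derived from an ordered list of step IDs. The sketch below shows one way to do that; the function name and example IDs are hypothetical, and steps are referenced by ID only.

```python
def build_pipeline(name, step_ids, pipeline_type="basicTransformation",
                   enabled=True):
    """Build a transformation pipeline body from step IDs in execution order."""
    return {
        "name": name,
        "type": pipeline_type,
        "enabled": "true" if enabled else "false",
        "stepAssociations": [
            {"position": str(position), "step": {"id": step_id}}
            for position, step_id in enumerate(step_ids, start=1)
        ],
    }


if __name__ == "__main__":
    print(build_pipeline("MARC to Inventory", ["1001", "1002", "1003"]))
```

As noted in the schema description, the step list is only honored on POST; on PUT it is ignored.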

Response samples

201
400

Content type

application/json

{"id": "string",
"name": "string",
"description": "string",
"type": "basicTransformation",
"enabled": "true",
"parallel": "true",
"stepAssociations": [{"id": "string",
"position": "string",
"step": {"entityType": "xmlTransformationStep",
"acl": "string",
"description": "string",
"inputFormat": "string",
"name": "string",
"outputFormat": "string",
"script": "<'script' omitted from nested displays>",
"customClass": "string",
"id": "string",
"testData": "<'testData' omitted from nested displays>",
"testOutput": "<'testOutput' omitted from nested displays>"
},
"transformation": "string"
}
]
}

getTransformations

Get brief transformation definitions

query Parameters

query

string

CQL

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"transformations": [{"id": 0,
"name": "string",
"description": "string",
"inputFormat": "string",
"outputFormat": "string",
"type": "string"
}
],
"totalRecords": 0
}

getTransformation

Get transformation pipeline

path Parameters

id

required

integer

Transformation pipeline identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"id": "string",
"name": "string",
"description": "string",
"type": "basicTransformation",
"enabled": "true",
"parallel": "true",
"stepAssociations": [{"id": "string",
"position": "string",
"step": {"entityType": "xmlTransformationStep",
"acl": "string",
"description": "string",
"inputFormat": "string",
"name": "string",
"outputFormat": "string",
"script": "<'script' omitted from nested displays>",
"customClass": "string",
"id": "string",
"testData": "<'testData' omitted from nested displays>",
"testOutput": "<'testOutput' omitted from nested displays>"
},
"transformation": "string"
}
]
}

putTransformation

Update transformation pipeline

path Parameters

id

required

integer

Transformation pipeline identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

400

Content type

text/plain

No sample

deleteTransformation

Delete a transformation pipeline

path Parameters

id

required

integer

Transformation pipeline identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

getSteps

Get brief transformation step definition records

Responses

Response samples

200

Content type

application/json

{"steps": [{"id": 0,
"name": "string",
"enabled": "string",
"description": "string"
}
],
"totalRecords": 0
}

postStep

Create new transformation step definition

Request Body schema: application/json

id	string Unique identifier for the transformation step.
name required	string A name assigned to the transformation step.
enabled	string Default: "false" Indicates if this step is available to be used in a transformation pipeline.
description	string Additional description of the transformation step.
type required	string Enum: "XmlTransformStep" "CustomTransformStep" Type of transformation step.
inputFormat	string Free-text indication of the format of input data to the step.
outputFormat	string Free-text indication of the format of the resulting output from the step.
testData	string Sample input data for testing.
testOutput	string Output from testing using the sample test-data.
customClass	string Only CustomTransformSteps: fully qualified class name of the class performing the transformation.
script	string Transformation script, typically XSLT.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"name": "string",
"enabled": "false",
"description": "string",
"type": "XmlTransformStep",
"inputFormat": "string",
"outputFormat": "string",
"testData": "string",
"testOutput": "string",
"customClass": "string",
"script": "string"
}
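
Since the `script` of an XML transformation step is typically XSLT, a client might check that the stylesheet is at least well-formed XML before posting it. The sketch below does this with the standard library. It does not validate XSLT semantics, and the function name, the identity stylesheet and the format labels are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Identity transform, used here only as example content.
IDENTITY_XSLT = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>
"""


def build_xml_step(name, script, input_format="XML", output_format="XML",
                   enabled=False):
    """Build a step definition body after a well-formedness check."""
    ET.fromstring(script)  # raises xml.etree.ElementTree.ParseError if malformed
    return {
        "name": name,
        "type": "XmlTransformStep",
        "enabled": "true" if enabled else "false",
        "inputFormat": input_format,
        "outputFormat": output_format,
        "script": script,
    }


if __name__ == "__main__":
    step = build_xml_step("Identity copy", IDENTITY_XSLT)
    print(step["name"], step["type"])
```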

Response samples

201
400

Content type

application/json

{"acl": "string",
"id": "string",
"name": "string",
"enabled": "false",
"description": "string",
"type": "XmlTransformStep",
"inputFormat": "string",
"outputFormat": "string",
"testData": "string",
"testOutput": "string",
"customClass": "string",
"script": "string"
}

deleteSteps

Delete all transformation step definitions

Responses

getStep

Get detailed transformation step definition record

path Parameters

id

required

string

Step identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"acl": "string",
"id": "string",
"name": "string",
"enabled": "false",
"description": "string",
"type": "XmlTransformStep",
"inputFormat": "string",
"outputFormat": "string",
"testData": "string",
"testOutput": "string",
"customClass": "string",
"script": "string"
}

putStep

Update a transformation step definition

path Parameters

id

required

string

Step identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json

id	string Unique identifier for the transformation step.
name required	string A name assigned to the transformation step.
enabled	string Default: "false" Indicates if this step is available to be used in a transformation pipeline.
description	string Additional description of the transformation step.
type required	string Enum: "XmlTransformStep" "CustomTransformStep" Type of transformation step.
inputFormat	string Free-text indication of the format of input data to the step.
outputFormat	string Free-text indication of the format of the resulting output from the step.
testData	string Sample input data for testing.
testOutput	string Output from testing using the sample test-data.
customClass	string Only CustomTransformSteps: fully qualified class name of the class performing the transformation.
script	string Transformation script, typically XSLT.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"name": "string",
"enabled": "false",
"description": "string",
"type": "XmlTransformStep",
"inputFormat": "string",
"outputFormat": "string",
"testData": "string",
"testOutput": "string",
"customClass": "string",
"script": "string"
}

Response samples

200
400

Content type

application/json

{"acl": "string",
"id": "string",
"name": "string",
"enabled": "false",
"description": "string",
"type": "XmlTransformStep",
"inputFormat": "string",
"outputFormat": "string",
"testData": "string",
"testOutput": "string",
"customClass": "string",
"script": "string"
}

deleteStep

Delete a transformation step definition

path Parameters

id

required

string

Step identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

getScript

Get transformation step script

path Parameters

id

required

string

Step identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

putScript

Update a transformation step script

path Parameters

id

required

string

Step identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/xml

property name*

additional property

any

Responses

Response samples

400

Content type

text/plain

No sample
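
Unlike most operations here, this one takes an `application/xml` body. A minimal sketch of preparing such a request with `urllib.request` follows. The endpoint path is not given in this section, so `script_url` is a caller-supplied, hypothetical URL; the function name is likewise an assumption. The request is only constructed, not sent.

```python
import urllib.request


def build_put_script_request(script_url: str, xslt: str,
                             tenant: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request carrying the script as XML."""
    return urllib.request.Request(
        script_url,
        data=xslt.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/xml",
            "X-Okapi-Tenant": tenant,
            "X-Okapi-Token": token,
        },
    )


if __name__ == "__main__":
    request = build_put_script_request(
        "https://okapi.example.org/hypothetical/path/to/script",  # placeholder URL
        "<xsl:stylesheet version='1.0' "
        "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'/>",
        "diku", "example-token")
    print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen` would perform the call.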

getTsas

Get transformation step associations

Responses

Response samples

200

Content type

application/json

{"transformationSteps": [{"id": "string",
"step": {"id": "string",
"name": "string"
},
"transformation": "string",
"transformationName": "string",
"position": "string"
}
],
"totalRecords": 0
}
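
Because `position` is a string, sorting associations lexically would place "10" before "2". The sketch below groups the associations from a response like the one above by pipeline and orders the steps numerically. The function name is hypothetical, and it assumes every association carries a numeric `position` and a `step` with a `name`.

```python
from collections import defaultdict


def steps_by_pipeline(response: dict) -> dict:
    """Map each transformation ID to its step names in pipeline order."""
    grouped = defaultdict(list)
    for tsa in response.get("transformationSteps", []):
        grouped[tsa["transformation"]].append(tsa)
    return {
        pipeline: [t["step"]["name"]
                   for t in sorted(tsas, key=lambda t: int(t["position"]))]
        for pipeline, tsas in grouped.items()
    }


if __name__ == "__main__":
    sample = {"transformationSteps": [
        {"transformation": "500", "position": "10", "step": {"id": "3", "name": "last"}},
        {"transformation": "500", "position": "2", "step": {"id": "2", "name": "second"}},
        {"transformation": "500", "position": "1", "step": {"id": "1", "name": "first"}},
    ]}
    print(steps_by_pipeline(sample))
```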

postTsa

Create new transformation step association

Request Body schema: application/json

Any of

id

string

Unique identifier for the association.

step

required

object or object

Contains the ID of the step that is associated with a pipeline.

Any of

id required	string Id of the step associated with a pipeline.
name	string Name for the step associated with a pipeline.
property name* additional property	any

transformation

required

string

Id of the transformation pipeline that the step is associated with.

transformationName

string

Transient. Optional alternative to the id for looking up the transformation to attach the step to.

position

required

string

The position of the step amongst other transformation steps in the pipeline.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"step": {"id": "string",
"name": "string"
},
"transformation": "string",
"transformationName": "string",
"position": "string"
}

Response samples

201
400

Content type

application/json

{"id": "string",
"step": {"id": "string",
"name": "string"
},
"transformation": "string",
"transformationName": "string",
"position": "string"
}

deleteTsas

Delete all transformation step associations

Responses

getTsa

Get a transformation step association

path Parameters

id

required

string

Association identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

Response samples

200
400

Content type

application/json

{"id": "string",
"step": {"id": "string",
"name": "string"
},
"transformation": "string",
"transformationName": "string",
"position": "string"
}

putTsa

Update a transformation step association

path Parameters

id

required

string

Association identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Request Body schema: application/json

Any of

id

string

Unique identifier for the association.

step

required

object or object

Contains the ID of the step that is associated with a pipeline.

Any of

id required	string Id of the step associated with a pipeline.
name	string Name for the step associated with a pipeline.
property name* additional property	any

transformation

required

string

Id of the transformation pipeline that the step is associated with.

transformationName

string

Transient. Optional alternative to the id for looking up the transformation to attach the step to.

position

required

string

The position of the step amongst other transformation steps in the pipeline.

Responses

Request samples

Payload

Content type

application/json

{"id": "string",
"step": {"id": "string",
"name": "string"
},
"transformation": "string",
"transformationName": "string",
"position": "string"
}

Response samples

200
400

Content type

application/json

{"id": "string",
"step": {"id": "string",
"name": "string"
},
"transformation": "string",
"transformationName": "string",
"position": "string"
}

deleteTsa

Delete a transformation step association

path Parameters

id

required

string

Association identifier

header Parameters

X-Okapi-Tenant	string Okapi Tenant
X-Okapi-Token	string Okapi Token
X-Okapi-Url	string Okapi URL

Responses

getIds

Get up to 100 random 15-digit numbers

query Parameters

count

integer

integer, max 100

Responses

purgeAgedLogs

Delete old harvest logs from storage
