Model storage
Store models and other files on S3-compatible storage, in particular Minio.
- Connectivity to the S3 backend is done by setting the environment variables BLACKBAR_S3_ENDPOINT, BLACKBAR_S3_ACCESS_KEY_ID, BLACKBAR_S3_SECRET_ACCESS_KEY and BLACKBAR_S3_SSL_VERIFY
- Once these are set, you can perform the S3 operations
from rlike import *
environ = {
"BLACKBAR_S3_ENDPOINT": "blackbar.datatailor.be",
"BLACKBAR_S3_ACCESS_KEY_ID": "XXXXXXXXXX",
"BLACKBAR_S3_SECRET_ACCESS_KEY": "XXXXXXXXXX",
"BLACKBAR_S3_SSL_VERIFY": "true"}
Sys_setenv(environ)
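The same variables can presumably also be set with Python's standard `os.environ`. The sketch below assumes that blackbar reads them from the process environment when an S3 function is called, which the text above suggests but does not state. The helper `missing_s3_settings` is hypothetical and not part of blackbar; it only checks that the four variables are present before any S3 call is attempted.

```python
import os

# Names of the environment variables listed above.
REQUIRED_S3_VARS = (
    "BLACKBAR_S3_ENDPOINT",
    "BLACKBAR_S3_ACCESS_KEY_ID",
    "BLACKBAR_S3_SECRET_ACCESS_KEY",
    "BLACKBAR_S3_SSL_VERIFY",
)


def missing_s3_settings(env=None):
    """Hypothetical helper: return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_S3_VARS if not env.get(name)]


# Placeholder values, identical to the example above.
os.environ.update({
    "BLACKBAR_S3_ENDPOINT": "blackbar.datatailor.be",
    "BLACKBAR_S3_ACCESS_KEY_ID": "XXXXXXXXXX",
    "BLACKBAR_S3_SECRET_ACCESS_KEY": "XXXXXXXXXX",
    "BLACKBAR_S3_SSL_VERIFY": "true",
})

# Fail early with a readable message instead of a connection error later on.
missing = missing_s3_settings()
if missing:
    raise RuntimeError("Missing S3 settings: " + ", ".join(missing))
```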
List / Upload / Download / Remove
blackbar.s3.blackbar_s3_list(type='buckets', bucket='blackbar-models', recursive=True)
Get the list of buckets, or of the files in a bucket
Parameters:
Name | Type | Description | Default |
---|---|---|---|
type | str | Either 'buckets' or 'objects' | 'buckets' |
bucket | str | Bucket name | 'blackbar-models' |
recursive | bool | Boolean indicating if files need to be fetched recursively | True |
Returns:
Type | Description |
---|---|
DataFrame | in case of type: 'buckets' a pandas dataframe with columns: bucket, creation_date and versioning |
DataFrame | in case of type: 'objects' a pandas dataframe with columns: bucket, object_name, owner_name, size, content_type, metadata, last_modified, version_id |
Examples:
>>> from blackbar import *
>>> x = blackbar_s3_list(type = "buckets")
>>> x = blackbar_s3_list(type = "objects", bucket = "blackbar-models", recursive = True)
>>> x = blackbar_s3_list(type = "objects", bucket = "bnosac", recursive = True)
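The 'objects' listing carries a version_id and a last_modified column, which can be used to decide which version to pass to the download functions. The sketch below illustrates that selection on made-up rows; plain dictionaries stand in for the rows of the returned dataframe, and `latest_version_id` is a hypothetical helper, not a blackbar function. Only the column names are taken from the description above.

```python
from datetime import datetime

# Made-up rows standing in for the 'objects' listing. Only the column names
# (object_name, last_modified, version_id) come from the documentation.
rows = [
    {"object_name": "test", "last_modified": datetime(2024, 1, 5), "version_id": "v1"},
    {"object_name": "test", "last_modified": datetime(2024, 2, 9), "version_id": "v2"},
    {"object_name": "other", "last_modified": datetime(2024, 1, 7), "version_id": "v1"},
]


def latest_version_id(rows, object_name):
    """Return the version_id of the most recently modified row for object_name."""
    matches = [row for row in rows if row["object_name"] == object_name]
    if not matches:
        return None
    return max(matches, key=lambda row: row["last_modified"])["version_id"]


print(latest_version_id(rows, "test"))  # prints v2
```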
blackbar.s3.blackbar_s3_upload(object, name, bucket='blackbar-models', metadata={}, project='default')
Upload a blackbar model to an S3 bucket (e.g. Minio/S3)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
object | | a spacy model | required |
name | str | String with the name of the object | required |
bucket | str | Bucket where the models will be stored | 'blackbar-models' |
metadata | dict | Dictionary with the metadata to save in the S3 object | {} |
project | str | String with the name of the project which will be put as tag in the S3 object | 'default' |
Returns:
Type | Description |
---|---|
dict | a dict with elements: bucket, name, size, url |
Examples:
>>> from blackbar import *
>>> import spacy
>>> nlp = spacy.blank('nl')
>>> msg = blackbar_s3_upload(nlp, name = "test", bucket = "blackbar-models")
>>> p = blackbar_model_save(nlp, type = 'zip')
>>> msg = blackbar_s3_upload(p, name = "test", bucket = "bnosac")
blackbar.s3.blackbar_s3_download(name, bucket='blackbar-models', folder=tempdir(), version_id=None)
Download a blackbar model from an S3 bucket (e.g. Minio/S3)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | String with the name of the object | required |
bucket | str | Bucket where the models are stored | 'blackbar-models' |
folder | str | Path where the model will be downloaded to | tempdir() |
version_id | str | Version id of the object (optional) | None |
Returns:
Type | Description |
---|---|
dict | a dict with elements: bucket, name, model_name, tags, metadata, size, folder, files |
Examples:
>>> from blackbar import *
>>> import spacy
>>> nlp = spacy.blank('nl')
>>> info = blackbar_s3_upload(nlp, name = "test", bucket = "blackbar-models")
>>> info = blackbar_s3_upload(nlp, name = "test", bucket = "blackbar-models")
>>> info = blackbar_s3_download(name = "test", bucket = "blackbar-models")
>>> info = blackbar_s3_download(name = "test", bucket = "blackbar-models", version_id = "")
>>> info = blackbar_s3_download(name = "test", bucket = "blackbar-models", folder = tempdir())
>>> model = blackbar_model_load(info)
blackbar.s3.s3_remove(type='objects', name=None, bucket='blackbar-models', version_id=None)
Remove an object or bucket from S3
Parameters:
Name | Type | Description | Default |
---|---|---|---|
type | str | Either 'buckets' or 'objects' | 'objects' |
bucket | str | Bucket name | 'blackbar-models' |
name | str | String with the name of the object | None |
version_id | str | Version id of the object (optional) | None |
Returns:
Type | Description |
---|---|
Nothing, deletes the object from S3 |
Examples:
>>> from blackbar import *
>>> x = blackbar_s3_write({"test": "test", "abc": [123, 456]}, bucket = "bnosac", name = "testci")
>>> x = blackbar_s3_read(bucket = "bnosac", name = "testci")
>>> x
{'test': 'test', 'abc': [123, 456]}
>>> s3_remove(bucket = "bnosac", name = "testci")
blackbar.s3.s3_upload_file(object, name, bucket='blackbar-models', metadata={}, project='default')
Upload a file to an S3 bucket (e.g. Minio/S3)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
object | | the path to a file on disk | required |
name | str | String with the name of the object | required |
bucket | str | Bucket where the models will be stored | 'blackbar-models' |
metadata | dict | Dictionary with the metadata to save in the S3 object | {} |
project | str | String with the name of the project which will be put as tag in the S3 object | 'default' |
Returns:
Type | Description |
---|---|
dict | a dict with elements: bucket, name, size, url |
Examples:
>>> from blackbar import *
>>> import spacy
>>> nlp = spacy.blank('nl')
>>> p = blackbar_model_save(nlp, type = 'zip')
>>> p = p['path']
>>> msg = s3_upload_file(p, name = "test", bucket = "bnosac")
blackbar.s3.s3_download_file(name, bucket='blackbar-models', folder=tempdir(), filename=None, version_id=None)
Download a file from an S3 bucket (e.g. Minio/S3)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | String with the name of the object | required |
bucket | str | Bucket where the file is stored | 'blackbar-models' |
folder | str | Path where the file will be downloaded to | tempdir() |
filename | str | If provided, saves the file with this filename in the folder; otherwise uses the name provided in name | None |
version_id | str | Version id of the object (optional) | None |
Returns:
Type | Description |
---|---|
dict | a dict with elements: bucket, name, tags, metadata, size, folder, files |
Examples:
>>> from blackbar import *
>>> from rlike import *
>>> import spacy
>>> nlp = spacy.blank('nl')
>>> p = blackbar_model_save(nlp, type = 'zip')
>>> p = p['path']
>>> msg = s3_upload_file(p, name = "test", bucket = "bnosac")
>>> msg = s3_download_file(name = "test", bucket = "bnosac")
>>> msg = file_remove(file_path(msg["folder"], msg["files"]))
>>> p = savePickle("hello this is some text", "myfile.pickle")
>>> msg = s3_upload_file(p, name = "test", bucket = "bnosac")
>>> msg = s3_download_file(name = "test", bucket = "bnosac", filename = "myfile_downloaded.pickle")
>>> readPickle(file_path(msg["folder"], msg["files"]))
'hello this is some text'
>>> msg = file_remove(["myfile.pickle", file_path(msg["folder"], msg["files"])])
Save / Load blackbar models
blackbar.s3.blackbar_model_save(object, type='zip', path=tempfile(pattern='blackbar-', fileext='.zip'))
Save a blackbar model as a single file on disk
Parameters:
Name | Type | Description | Default |
---|---|---|---|
object | | a spacy model or other future models | required |
type | str | type, currently only 'zip' is supported | 'zip' |
path | str | Path to where the file will be stored | tempfile(pattern='blackbar-', fileext='.zip') |
Returns:
Type | Description |
---|---|
dict | a dictionary with elements metadata, type, path and size |
Examples:
>>> from blackbar import *
>>> from rlike import *
>>> import spacy
>>> nlp = spacy.blank('nl')
>>> p = blackbar_model_save(nlp, type = 'zip')
>>> done = file_remove(p['path'])
>>> p = blackbar_model_save(nlp, type = 'zip', path = 'mymodel.zip')
>>> done = file_remove(p['path'])
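How blackbar builds the archive internally is not described here. As an illustration of what "a single zip file on disk" can mean for a model that normally lives in a directory, the sketch below packs a folder into one archive using only the standard library. The function `zip_folder` and the stand-in file `config.cfg` are hypothetical; the returned dictionary only mirrors the documented type, path and size elements and leaves out metadata.

```python
import os
import tempfile
import zipfile


def zip_folder(folder, zip_path):
    """Pack every file under folder into one zip archive and report its size."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(folder):
            for filename in files:
                full_path = os.path.join(root, filename)
                # Store paths relative to the folder so the archive is relocatable.
                archive.write(full_path, os.path.relpath(full_path, folder))
    return {"type": "zip", "path": zip_path, "size": os.path.getsize(zip_path)}


# Stand-in for a model directory that was written to disk.
model_dir = tempfile.mkdtemp(prefix="blackbar-")
with open(os.path.join(model_dir, "config.cfg"), "w") as handle:
    handle.write('[nlp]\nlang = "nl"\n')

# Write the archive outside the model directory so it does not include itself.
out_dir = tempfile.mkdtemp()
info = zip_folder(model_dir, os.path.join(out_dir, "mymodel.zip"))
```

A single archive is convenient for object storage because one model then corresponds to exactly one S3 object and one version id.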
blackbar.s3.blackbar_model_load(object)
Load a blackbar model which was downloaded from an S3 bucket (e.g. Minio/S3)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
object | | An object as returned by blackbar_s3_download | required |
Returns:
Type | Description |
---|---|
TODO |
Examples:
>>> from blackbar import *
>>> from rlike import *
>>> import spacy
>>> nlp = spacy.blank('nl')
>>> info = blackbar_s3_upload(nlp, name = "test", bucket = "blackbar-models")
>>> info = blackbar_s3_download(name = "test", bucket = "blackbar-models")
>>> model = blackbar_model_load(info)
Save / Load pickled data
blackbar.s3.blackbar_s3_read(name, bucket='blackbar-models', type='pickle', version_id=None)
Read a dataset from S3
Parameters:
Name | Type | Description | Default |
---|---|---|---|
bucket | str | Bucket name | 'blackbar-models' |
name | str | String with the name of the object | required |
type | str | Currently only 'pickle' | 'pickle' |
version_id | str | Version id of the object (optional) | None |
Returns:
Type | Description |
---|---|
The result of reading the pickled data |
Examples:
>>> from blackbar import *
>>> x = blackbar_s3_write({"test": "test", "abc": [123, 456]}, bucket = "bnosac", name = "testci")
>>> x = blackbar_s3_read(bucket = "bnosac", name = "testci")
>>> x
{'test': 'test', 'abc': [123, 456]}
>>> s3_remove(bucket = "bnosac", name = "testci")
blackbar.s3.blackbar_s3_write(data, name, bucket='blackbar-models', type='pickle')
Write a dataset to S3
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Any | An object which can be pickled | required |
bucket | str | Bucket name | 'blackbar-models' |
name | str | String with the name of the object | required |
type | str | Currently only 'pickle' | 'pickle' |
Returns:
Type | Description |
---|---|
A list with elements bucket, name, size, url |
Examples:
>>> from blackbar import *
>>> x = blackbar_s3_write({"test": "test", "abc": [123, 456]}, bucket = "bnosac", name = "testci")
>>> x = blackbar_s3_read(bucket = "bnosac", name = "testci")
>>> x
{'test': 'test', 'abc': [123, 456]}
>>> s3_remove(bucket = "bnosac", name = "testci")
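The 'pickle' type means that the data is serialised with Python's pickle format before it is stored. The round trip can be sketched without any S3 connection: below, an in-memory buffer stands in for the stored object. This only illustrates the serialisation step and makes no claim about how blackbar transfers the bytes.

```python
import io
import pickle

data = {"test": "test", "abc": [123, 456]}

# Serialise to bytes, as is needed before an object can be sent to a bucket.
payload = pickle.dumps(data)

# The buffer stands in for the stored S3 object.
buffer = io.BytesIO(payload)

# Reading back restores an equal Python object.
restored = pickle.load(buffer)
print(restored == data)  # prints True
```

Because unpickling can execute arbitrary code, only read pickled objects from buckets you trust.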