Model storage

Store models and other files on S3, in particular using Minio.

  • Connectivity to the S3 backend is configured through the environment variables BLACKBAR_S3_ENDPOINT, BLACKBAR_S3_ACCESS_KEY_ID, BLACKBAR_S3_SECRET_ACCESS_KEY and BLACKBAR_S3_SSL_VERIFY
  • Once these are set, you can perform the S3 operations described below
from rlike import *
environ = {
    "BLACKBAR_S3_ENDPOINT": "blackbar.datatailor.be",
    "BLACKBAR_S3_ACCESS_KEY_ID": "XXXXXXXXXX", 
    "BLACKBAR_S3_SECRET_ACCESS_KEY": "XXXXXXXXXX",
    "BLACKBAR_S3_SSL_VERIFY": "true"}
Sys_setenv(environ)
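
As a sketch of the same setup using only the standard library: the rlike helper Sys_setenv is assumed here to set process environment variables in the way R's Sys.setenv does, so os.environ should be an equivalent route. The credentials are placeholders, and missing_s3_settings is a hypothetical helper that is not part of blackbar.

```python
import os

# The four variables named in this section.
REQUIRED_VARS = (
    "BLACKBAR_S3_ENDPOINT",
    "BLACKBAR_S3_ACCESS_KEY_ID",
    "BLACKBAR_S3_SECRET_ACCESS_KEY",
    "BLACKBAR_S3_SSL_VERIFY",
)


def missing_s3_settings(env=None):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Placeholder values; replace them with real credentials.
os.environ.update({
    "BLACKBAR_S3_ENDPOINT": "blackbar.datatailor.be",
    "BLACKBAR_S3_ACCESS_KEY_ID": "XXXXXXXXXX",
    "BLACKBAR_S3_SECRET_ACCESS_KEY": "XXXXXXXXXX",
    "BLACKBAR_S3_SSL_VERIFY": "true",
})

# An empty list means all four variables are present.
print(missing_s3_settings())
```

Checking for missing variables before the first S3 call tends to give a clearer error than a failed connection later on.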

List / Upload / Download / Remove

blackbar.s3.blackbar_s3_list(type='buckets', bucket='blackbar-models', recursive=True)

List the buckets, or the files in a bucket

Parameters:

Name       Type  Description                                                 Default
type       str   Either 'buckets' or 'objects'                               'buckets'
bucket     str   Bucket name                                                 'blackbar-models'
recursive  bool  Boolean indicating if files need to be fetched recursively  True

Returns:

Type       Description
DataFrame  When type is 'buckets': a pandas DataFrame with columns bucket, creation_date and versioning
DataFrame  When type is 'objects': a pandas DataFrame with columns bucket, object_name, owner_name, size, content_type, metadata, last_modified, version_id

Examples:

>>> from blackbar import *
>>> x = blackbar_s3_list(type = "buckets")
>>> x = blackbar_s3_list(type = "objects", bucket = "blackbar-models", recursive = True)
>>> x = blackbar_s3_list(type = "objects", bucket = "bnosac", recursive = True)

blackbar.s3.blackbar_s3_upload(object, name, bucket='blackbar-models', metadata={}, project='default')

Upload a blackbar model to an S3 bucket (e.g. Minio/S3)

Parameters:

Name      Type  Description                                                                    Default
object          a spacy model                                                                  required
name      str   String with the name of the object                                             required
bucket    str   Bucket where the models will be stored                                         'blackbar-models'
metadata  dict  Dictionary with the metadata to save in the S3 object                          {}
project   str   String with the name of the project which will be put as tag in the S3 object  'default'

Returns:

Type  Description
dict  a dict with elements: bucket, name, size, url

Examples:

>>> from blackbar import *
>>> import spacy 
>>> nlp = spacy.blank('nl') 
>>> msg = blackbar_s3_upload(nlp, name = "test", bucket = "blackbar-models")
>>> p = blackbar_model_save(nlp, type = 'zip')
>>> msg = blackbar_s3_upload(p, name = "test", bucket = "bnosac")

blackbar.s3.blackbar_s3_download(name, bucket='blackbar-models', folder=tempdir(), version_id=None)

Download a blackbar model from an S3 bucket (e.g. Minio/S3)

Parameters:

Name        Type  Description                                 Default
name        str   String with the name of the object          required
bucket      str   Bucket where the models are stored          'blackbar-models'
folder      str   Path where the model will be downloaded to  tempdir()
version_id  str   Version id of the object (optional)         None

Returns:

Type  Description
dict  a dict with elements: bucket, name, model_name, tags, metadata, size, folder, files

Examples:

>>> from blackbar import *
>>> import spacy 
>>> nlp = spacy.blank('nl') 
>>> info = blackbar_s3_upload(nlp, name = "test", bucket = "blackbar-models")
>>> info = blackbar_s3_upload(nlp, name = "test", bucket = "blackbar-models")
>>> info = blackbar_s3_download(name = "test", bucket = "blackbar-models")
>>> info = blackbar_s3_download(name = "test", bucket = "blackbar-models", version_id = "")
>>> info = blackbar_s3_download(name = "test", bucket = "blackbar-models", folder = tempdir())
>>> model = blackbar_model_load(info)
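
The repeated upload followed by a download with version_id relies on S3 object versioning: on a bucket with versioning enabled, each upload under the same name creates a new version, and a download without a version id returns the latest one. The sketch below is a toy in-memory stand-in to illustrate that behaviour. It is not blackbar or S3 code, and ToyVersionedBucket and its methods are hypothetical names.

```python
import uuid


class ToyVersionedBucket:
    """In-memory stand-in for a versioned bucket (hypothetical, illustration only)."""

    def __init__(self):
        # Maps object name -> list of (version_id, payload), oldest first.
        self._objects = {}

    def put(self, name, payload):
        """Store a new version of `name` and return its version id."""
        version_id = uuid.uuid4().hex
        self._objects.setdefault(name, []).append((version_id, payload))
        return version_id

    def get(self, name, version_id=None):
        """Return the latest payload, or the payload matching `version_id`."""
        versions = self._objects[name]
        if version_id is None:
            return versions[-1][1]
        for vid, payload in versions:
            if vid == version_id:
                return payload
        raise KeyError(f"{name!r} has no version {version_id!r}")


bucket = ToyVersionedBucket()
first = bucket.put("test", b"model v1")
second = bucket.put("test", b"model v2")

latest = bucket.get("test")                    # newest version
pinned = bucket.get("test", version_id=first)  # explicitly requested older version
```

In blackbar itself, the version ids to pass as version_id are presumably the ones reported in the version_id column of blackbar_s3_list(type = "objects").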

blackbar.s3.s3_remove(type='objects', name=None, bucket='blackbar-models', version_id=None)

Remove an object or bucket from S3

Parameters:

Name        Type  Description                          Default
type        str   Either 'buckets' or 'objects'        'objects'
bucket      str   Bucket name                          'blackbar-models'
name        str   String with the name of the object   None
version_id  str   Version id of the object (optional)  None

Returns:

Type Description

Nothing, deletes the object from S3

Examples:

>>> from blackbar import *
>>> x = blackbar_s3_write({"test": "test", "abc": [123, 456]}, bucket = "bnosac", name = "testci")
>>> x = blackbar_s3_read(bucket = "bnosac", name = "testci")
>>> x
{'test': 'test', 'abc': [123, 456]}
>>> s3_remove(bucket = "bnosac", name = "testci")

blackbar.s3.s3_upload_file(object, name, bucket='blackbar-models', metadata={}, project='default')

Upload a file to an S3 bucket (e.g. Minio/S3)

Parameters:

Name      Type  Description                                                                    Default
object          the path to a file on disk                                                     required
name      str   String with the name of the object                                             required
bucket    str   Bucket where the file will be stored                                           'blackbar-models'
metadata  dict  Dictionary with the metadata to save in the S3 object                          {}
project   str   String with the name of the project which will be put as tag in the S3 object  'default'

Returns:

Type  Description
dict  a dict with elements: bucket, name, size, url

Examples:

>>> from blackbar import *
>>> import spacy 
>>> nlp = spacy.blank('nl') 
>>> p = blackbar_model_save(nlp, type = 'zip')
>>> p = p['path']
>>> msg = s3_upload_file(p, name = "test", bucket = "bnosac")

blackbar.s3.s3_download_file(name, bucket='blackbar-models', folder=tempdir(), filename=None, version_id=None)

Download a file from an S3 bucket (e.g. Minio/S3)

Parameters:

Name        Type  Description                                                                                            Default
name        str   String with the name of the object                                                                     required
bucket      str   Bucket where the file is stored                                                                        'blackbar-models'
folder      str   Path where the file will be downloaded to                                                              tempdir()
filename    str   If provided, the file is saved under this filename in the folder; otherwise the value of name is used  None
version_id  str   Version id of the object (optional)                                                                    None

Returns:

Type  Description
dict  a dict with elements: bucket, name, tags, metadata, size, folder, files

Examples:

>>> from blackbar import *
>>> from rlike import *
>>> import spacy 
>>> nlp = spacy.blank('nl') 
>>> p = blackbar_model_save(nlp, type = 'zip')
>>> p = p['path']
>>> msg = s3_upload_file(p, name = "test", bucket = "bnosac")
>>> msg = s3_download_file(name = "test", bucket = "bnosac")
>>> msg = file_remove(file_path(msg["folder"], msg["files"]))
>>> p = savePickle("hello this is some text", "myfile.pickle")
>>> msg = s3_upload_file(p, name = "test", bucket = "bnosac")
>>> msg = s3_download_file(name = "test", bucket = "bnosac", filename = "myfile_downloaded.pickle")
>>> readPickle(file_path(msg["folder"], msg["files"]))
'hello this is some text'
>>> msg = file_remove(["myfile.pickle", file_path(msg["folder"], msg["files"])])
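
The examples above combine the folder and files elements of the returned dict to locate the downloaded file. A minimal standard-library sketch of that step follows. It assumes files may hold either a single name or a list of names, which the documentation does not specify. The dict is a hand-written stand-in shaped like the documented return value, and downloaded_paths is a hypothetical helper.

```python
import os


def downloaded_paths(info):
    """Build local paths from a download result with 'folder' and 'files' keys."""
    files = info["files"]
    # Assumption: 'files' may be a single name or a list of names.
    if isinstance(files, str):
        files = [files]
    return [os.path.join(info["folder"], f) for f in files]


# Hypothetical result, shaped like the documented return value.
info = {
    "bucket": "bnosac",
    "name": "test",
    "tags": {},
    "metadata": {},
    "size": 23,
    "folder": os.path.join("tmp", "downloads"),
    "files": "myfile_downloaded.pickle",
}

paths = downloaded_paths(info)
```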

Save / Load blackbar models

blackbar.s3.blackbar_model_save(object, type='zip', path=tempfile(pattern='blackbar-', fileext='.zip'))

Save a blackbar model as a single file on disk

Parameters:

Name    Type  Description                                     Default
object        a spacy model or other future models            required
type    str   Output type; currently only 'zip' is supported  'zip'
path    str   Path where the file will be stored              tempfile(pattern='blackbar-', fileext='.zip')

Returns:

Type  Description
dict  a dictionary with elements metadata, type, path and size

Examples:

>>> from blackbar import *
>>> from rlike import *
>>> import spacy 
>>> nlp = spacy.blank('nl') 
>>> p = blackbar_model_save(nlp, type = 'zip')
>>> done = file_remove(p['path'])
>>> p = blackbar_model_save(nlp, type = 'zip', path = 'mymodel.zip')
>>> done = file_remove(p['path'])
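
To illustrate what saving a model "as a single file" can look like, the sketch below packs a directory into one zip archive using only the standard library. It is not necessarily how blackbar_model_save is implemented. The placeholder files stand in for a model directory such as the one spacy writes with nlp.to_disk, and zip_directory is a hypothetical helper.

```python
import os
import tempfile
import zipfile


def zip_directory(src_dir, zip_path):
    """Pack every file under src_dir into one zip archive; return its size in bytes."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for fname in files:
                full = os.path.join(root, fname)
                # Store paths relative to src_dir so the archive unpacks cleanly.
                zf.write(full, arcname=os.path.relpath(full, src_dir))
    return os.path.getsize(zip_path)


with tempfile.TemporaryDirectory() as workdir:
    model_dir = os.path.join(workdir, "model")
    os.makedirs(os.path.join(model_dir, "vocab"))
    # Placeholder files standing in for a saved model directory.
    with open(os.path.join(model_dir, "config.cfg"), "w") as fh:
        fh.write('[nlp]\nlang = "nl"\n')
    with open(os.path.join(model_dir, "vocab", "strings.json"), "w") as fh:
        fh.write("[]")

    zip_path = os.path.join(workdir, "blackbar-model.zip")
    size = zip_directory(model_dir, zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        names = sorted(zf.namelist())
```

A single archive is convenient for S3 because one model then maps to exactly one object, with one size and one version id.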

blackbar.s3.blackbar_model_load(object)

Load a blackbar model which was downloaded from an S3 bucket (e.g. Minio/S3)

Parameters:

Name    Type  Description                                    Default
object        An object as returned by blackbar_s3_download  required

Returns:

Type Description

The loaded model, for example a spacy model

Examples:

>>> from blackbar import *
>>> from rlike import *
>>> import spacy 
>>> nlp = spacy.blank('nl') 
>>> info = blackbar_s3_upload(nlp, name = "test", bucket = "blackbar-models")
>>> info = blackbar_s3_download(name = "test", bucket = "blackbar-models")
>>> model = blackbar_model_load(info)

Save / Load pickled data

blackbar.s3.blackbar_s3_read(name, bucket='blackbar-models', type='pickle', version_id=None)

Read a dataset from S3

Parameters:

Name        Type  Description                          Default
bucket      str   Bucket name                          'blackbar-models'
name        str   String with the name of the object   required
type        str   Currently only 'pickle'              'pickle'
version_id  str   Version id of the object (optional)  None

Returns:

Type Description

The result of reading the pickled data

Examples:

>>> from blackbar import *
>>> x = blackbar_s3_write({"test": "test", "abc": [123, 456]}, bucket = "bnosac", name = "testci")
>>> x = blackbar_s3_read(bucket = "bnosac", name = "testci")
>>> x
{'test': 'test', 'abc': [123, 456]}
>>> s3_remove(bucket = "bnosac", name = "testci")

blackbar.s3.blackbar_s3_write(data, name, bucket='blackbar-models', type='pickle')

Write a dataset to S3

Parameters:

Name    Type  Description                         Default
data    Any   An object which can be pickled      required
bucket  str   Bucket name                         'blackbar-models'
name    str   String with the name of the object  required
type    str   Currently only 'pickle'             'pickle'

Returns:

Type Description

A dict with elements bucket, name, size, url

Examples:

>>> from blackbar import *
>>> x = blackbar_s3_write({"test": "test", "abc": [123, 456]}, bucket = "bnosac", name = "testci")
>>> x = blackbar_s3_read(bucket = "bnosac", name = "testci")
>>> x
{'test': 'test', 'abc': [123, 456]}
>>> s3_remove(bucket = "bnosac", name = "testci")
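
Since 'pickle' is the only supported type, anything passed as data must be serialisable with Python's pickle module. The sketch below shows the local round trip that a write followed by a read presumably corresponds to, using the same example payload. It makes no S3 calls, and is_picklable is a hypothetical helper for checking an object up front.

```python
import pickle

data = {"test": "test", "abc": [123, 456]}

# Serialise to bytes, as is needed before an object can be stored remotely.
payload = pickle.dumps(data)

# Deserialise again; the result compares equal to the original object.
restored = pickle.loads(payload)


def is_picklable(obj):
    """Return True if obj can be serialised with pickle (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False
```

Unpickling can execute arbitrary code, so pickled objects should only be read from buckets whose contents are trusted.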