Architecture
The blackbar project provides tooling to anonymize and pseudonymize texts:
- Push text data to annotate to Inception
- Extract annotated texts from Inception
- Build models which identify entities in order to anonymize text records
- Automated anonymization of text records
- Functionality for pseudonymizing anonymized text records
- Automated pseudonymization of anonymized text records
- Apps for validation, checks, model building and automation
- AI Apps on top of the pseudonymized texts
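To make the anonymization and pseudonymization steps above concrete, here is a minimal sketch in plain Python. It is not the blackbar API: the span format `(start, end, label)`, the function names and the `surrogates` lookup are all assumptions made for illustration, and for brevity both replacements are applied directly to detected entity spans. In practice the spans would come from an entity recognition model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical span format for this sketch: (start, end, label), end exclusive.
Span = Tuple[int, int, str]


def replace_entities(text: str, spans: List[Span],
                     make_replacement: Callable[[str, str], str]) -> str:
    """Rebuild `text`, swapping every entity span for a replacement string."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor or end < start or end > len(text):
            raise ValueError("spans must not overlap and must lie inside the text")
        pieces.append(text[cursor:start])  # untouched text before the entity
        pieces.append(make_replacement(label, text[start:end]))
        cursor = end
    pieces.append(text[cursor:])  # untouched tail after the last entity
    return "".join(pieces)


def anonymize(text: str, spans: List[Span]) -> str:
    """Replace each entity with a placeholder such as [NAME]."""
    return replace_entities(text, spans, lambda label, original: f"[{label}]")


def pseudonymize(text: str, spans: List[Span], surrogates: Dict[str, str]) -> str:
    """Replace each entity with a fake stand-in value.

    `surrogates` is a hypothetical lookup from entity label to surrogate text.
    """
    return replace_entities(text, spans, lambda label, original: surrogates[label])


if __name__ == "__main__":
    text = "Patient Jan Peeters was seen in Gent."
    name, place = "Jan Peeters", "Gent"
    spans = [
        (text.index(name), text.index(name) + len(name), "NAME"),
        (text.index(place), text.index(place) + len(place), "LOCATION"),
    ]
    print(anonymize(text, spans))
    # Patient [NAME] was seen in [LOCATION].
    print(pseudonymize(text, spans, {"NAME": "Piet Janssens", "LOCATION": "Leuven"}))
    # Patient Piet Janssens was seen in Leuven.
```

Rebuilding the string from left to right, rather than replacing in place, keeps the character offsets of later spans valid even when a replacement has a different length than the original entity.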
High-level infrastructure
- Database with texts: IRIS or an ANSI SQL database
- Inception for annotation & generating training data
- NLP models (spaCy / PyTorch / transformers), alignment: Smith-Waterman
- Automation: Prefect
- Model storage backend: Minio
- Docker/Podman-based tooling
- Python packages
- Webapps behind ShinyProxy
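The infrastructure list names Smith-Waterman for alignment. As an illustration of that algorithm only, the following is a minimal character-level sketch of Smith-Waterman local alignment with a linear gap penalty. It is not the textalignment package's implementation or API; the function name and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for the example. One plausible use, again an assumption, is locating a known string such as a name inside a longer text despite small spelling differences.

```python
from typing import Tuple


def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -1) -> Tuple[int, str, str]:
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative defaults, not taken from any library.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at 0 so an alignment can start anywhere
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")  # gap in b
            i -= 1
        else:
            out_a.append("-")  # gap in a
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    score, left, right = smith_waterman("xxJan Peetersyy", "Jan Peeters")
    print(score, left, right)
```

Because the alignment is local, the surrounding characters (`xx` and `yy` in the usage example) do not lower the score; only the best-matching region is reported.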
The project consists of the following repositories:
- blackbar with source code at https://github.com/bnosac/blackbar
- blackbar-py with source code at https://github.com/bnosac/blackbar-py
- blackbar-docker
- textalignment with source code at https://github.com/bnosac/textalignment
- rlike with source code at https://github.com/bnosac/rlike
The documentation of these repositories is public. To get access to the source code, contact us here.