Automated Document Classification and Management for School Accreditation
Uses two fine-tuned transformer-based NLP models, a BERT classifier and a Named Entity Recognition (NER) model, to sort documents into more than 279 classes and file them across more than 2,000 folders. Both models were trained on a synthetic dataset of more than 33,000 samples. Combining the two models with TF-IDF keyword analysis brought overall system accuracy to around 96%, with precision above 97%. The frontend uses React and HTML5; the backend uses Python, Hugging Face Transformers, and pytesseract for OCR. The web migration used Supabase/PostgreSQL as the database and Cloudflare tunneling.
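The description does not say how the model outputs and the TF-IDF keyword analysis are combined. One plausible approach is a weighted blend of the classifier's per-class probability and a TF-IDF cosine similarity against a keyword profile for each class. The sketch below illustrates that idea only: the function names, the class labels, the keyword profiles, and the `alpha` weight are all hypothetical, the classifier probabilities are passed in as plain numbers rather than produced by a real BERT model, and the NER step is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(profiles):
    """Compute smoothed inverse document frequencies over the class profiles."""
    n = len(profiles)
    doc_freq = Counter()
    for text in profiles.values():
        doc_freq.update(set(tokenize(text)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in doc_freq.items()}


def tfidf_vector(text, idf):
    """Build a sparse TF-IDF vector, ignoring terms outside the vocabulary."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def combine_scores(text, classifier_probs, profiles, alpha=0.7):
    """Blend classifier probabilities with TF-IDF keyword similarity.

    alpha is an assumed weight: 1.0 trusts only the classifier,
    0.0 trusts only the keyword match.
    """
    idf = build_idf(profiles)
    doc_vec = tfidf_vector(text, idf)
    scores = {}
    for label, profile in profiles.items():
        keyword_score = cosine(doc_vec, tfidf_vector(profile, idf))
        model_score = classifier_probs.get(label, 0.0)
        scores[label] = alpha * model_score + (1 - alpha) * keyword_score
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical class profiles and classifier output, for illustration only.
profiles = {
    "faculty_profile": "faculty profile curriculum vitae teaching load degree",
    "financial_report": "financial report budget audit expenditure statement",
}
classifier_probs = {"faculty_profile": 0.55, "financial_report": 0.45}
text = "Audit of the annual budget and expenditure statement"

label, scores = combine_scores(text, classifier_probs, profiles)
print(label, scores)
```

In this example the hypothetical classifier slightly prefers the wrong class, and the keyword overlap with the financial profile is enough to tip the blended score toward the other one. In a real system the weight would need tuning on held-out data.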