
Saugata Bose
Lecturer | AI/ML Researcher | Python Developer
I’m a PhD‑trained AI researcher, lecturer, and founder who turns cutting‑edge ideas into production‑ready systems. My doctorate at the University of Wollongong, Australia (supervised by Dr Guoxin Su and Prof Minjie Zhang) explored short‑text classification, one‑class anomaly detection, and representation learning with deep‑transfer techniques.
I now teach computer‑science courses ranging from machine learning and natural‑language processing to big‑data engineering, cloud computing, and software testing, while leading Analytics Edge Learning, where I build and deploy ML solutions using Python, PyTorch, TensorFlow, Hugging Face, AWS, and Azure. My projects include hate‑speech detectors, churn‑prediction pipelines, and Spark‑scale health analytics, all guided by rigorous research and responsible‑AI principles.
Whether mentoring students or shipping cloud‑native models, I focus on robust, explainable, and scalable AI that delivers real‑world impact.
News
- 03/2025, Appointed as a full‑time Lecturer at Canterbury Institute of Management (CIM), Australia.
- 10/2024, Began lecturing ICT712, ICT729 and ICT753 at King’s Own Institute (KOI), Sydney this semester.
- 09/2024, Paper accepted by WI‑IAT 2024.
- 09/2024, Began lecturing SDM404, CLA321 and BDA601 at Torrens University this semester.
- 08/2024, Conducting tutorial sessions for UTS 31282 starting this semester.
- 07/2024, Instructing COMP2100 & COMP8320 at MQ and CSIT115 at UOW.
- 06/2024, Running the INFO1110 bootcamp at Usyd.
- 04/2024, Joined the ECAI 2024 Program Committee.
- 03/2024, Confirmed: PhD in AI/ML from University of Wollongong, Australia.
- 02/2024, Instructing COMP3210/6210 at MQ, COMP5046 at Usyd, and several units at UOW.
Education
Doctor of Philosophy
University of Wollongong, Australia
Master of Science (Computer Science)
Savitribai Phule Pune University, India
Technical Expertise
Libraries: PyTorch, PySpark, TensorFlow, Keras, Scikit‑learn, NLTK, pandas, NumPy
ML & DL: BERT, GPT, Transformers, Autoencoders, LSTM, CNN, Isolation Forest, One‑Class SVM
Big Data Tools: Hadoop, Spark, Kafka
Web Tech: PHP, MySQL, XAMPP, AWS Cloud9
Security Tools: Python cryptography, regex‑based log analyzers, hashing (SHA‑256, MD5)
Research Highlights

Problem: Imbalanced short-text classification (ADMA 2023)
This project introduces DOCFT (Deep One-Class Fine-Tuning), a two-stage transfer learning framework tailored for short-text classification on imbalanced datasets. It employs a BERT encoder fine-tuned using a novel OC-SVM loss function combined with quantile regression. The model is trained in two alternating phases: one that adapts pre-trained features to anomaly-like samples, and another that optimizes the decision boundary using one-class principles.
Evaluated on four benchmark hate speech datasets (e.g., Stormfront, HASOC-2019, Davidson), DOCFT outperformed conventional binary classifiers by up to 9% in macro-F1 and demonstrated robust generalization across datasets. This project shows how transfer learning can be adapted for anomaly detection using one-class assumptions, even under severe data imbalance.
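The one-class principle mentioned above can be illustrated with the standard one-class SVM hinge objective, in which the offset rho is set to the nu-quantile of the model's scores. This is a minimal sketch under that textbook formulation, not the DOCFT loss itself; the function names, the toy scores, and the value of nu are all assumptions made for illustration.

```python
import math


def nu_quantile(scores, nu):
    """Lower empirical nu-quantile of the scores (illustrative helper)."""
    ordered = sorted(scores)
    # Position of the nu-quantile, clamped so it is always a valid index.
    k = max(0, math.ceil(nu * len(ordered)) - 1)
    return ordered[k]


def one_class_objective(scores, nu, weight_norm_sq=0.0):
    """Textbook one-class SVM objective evaluated on precomputed scores.

    loss = 0.5 * ||w||^2 + (1 / (nu * n)) * sum(max(0, rho - score)) - rho
    The offset rho is taken as the nu-quantile of the scores.
    """
    rho = nu_quantile(scores, nu)
    slack = [max(0.0, rho - s) for s in scores]
    loss = 0.5 * weight_norm_sq + sum(slack) / (nu * len(scores)) - rho
    return loss, rho


# Toy scores: one sample (0.1) sits far below the rest.
scores = [0.9, 0.8, 0.1, 0.7, 0.95, 0.85, 0.75, 0.6, 0.88, 0.92]
loss, rho = one_class_objective(scores, nu=0.2)
flagged = [s for s in scores if s < rho]  # samples falling below the boundary
print(loss, rho, flagged)
```

In a fine-tuning setting the scores would come from the encoder's output layer and the loss would be backpropagated; here they are fixed numbers so the arithmetic can be followed by hand.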
LexiFusedNet & LexiFuse+ (Lexicon-Fused One-Class Models) (PKAW 2023)
LexiFusedNet and its enhanced version LexiFuse+ are unified models tackling short-text classification on imbalanced data by fusing lexical knowledge with deep learning. These models leverage a hate speech lexicon (HateBase) to weight input features, combined with BERT transfer learning and one-class classification. LexiFuse+ introduces semi-supervised training using unlabeled data and a custom one-class loss function for fine-tuning.
This novel integration of lexicon-based features and one-class transfer learning achieved state-of-the-art results in hate speech detection, with LexiFuse+ yielding high macro-F1 scores (e.g., 0.88 on Davidson and Stormfront datasets) – a 2–6% improvement over models without lexicon fusion. These projects demonstrate how linguistic domain knowledge (lexicons) can be combined with deep networks to improve detection of toxic content under extreme class imbalance.
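The idea of using a lexicon to weight input features can be sketched as follows. This is only an illustration under simple assumptions: the lexicon is a hypothetical placeholder set (not HateBase data), the boost factor is arbitrary, and the pooling is a plain weighted average rather than the weighting scheme used in the papers.

```python
def lexicon_weights(tokens, lexicon, boost=2.0):
    """Give tokens found in the lexicon a larger weight than other tokens."""
    return [boost if tok.lower() in lexicon else 1.0 for tok in tokens]


def weighted_average(vectors, weights):
    """Pool per-token vectors into one vector using the given weights."""
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(w * vec[i] for vec, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


# Hypothetical lexicon and toy 2-dimensional token vectors.
lexicon = {"flagged_term"}
tokens = ["you", "are", "flagged_term"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

weights = lexicon_weights(tokens, lexicon)
pooled = weighted_average(vectors, weights)
print(weights, pooled)
```

With real models the token vectors would be BERT embeddings, and the pooled vector would feed the one-class classifier.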

Selected Publications
- VarietyDetect (WI-IAT 2024) | full paper: Introduced an adaptive one-class anomaly detection framework combining self-training and transfer learning.
- DOCFT (ADMA 2023) | full paper: Developed a Deep One-Class Fine-Tuning approach with quantile regression for handling imbalanced short-text datasets.
- LexiFusedNet (PKAW 2023) | full paper: Proposed fusion of lexicon-driven features with BERT-based transfer learning for interpretable classification.
- LexiFuse+ (eScience 2023) | full paper: Semi-supervised short-text classification using OC-SVM and unlabeled data on a BERT-based architecture.
- Deep One-Class Hate Speech Detection (LREC 2022) | full paper: A new paradigm to train hate speech detectors using only hate class data, improving generalization across datasets.
Technical Project Highlights
Big Data COVID-19 Analysis
Used Apache Spark with PySpark and MLlib to analyze large-scale COVID-19 datasets. Techniques: distributed aggregation, logistic regression, KMeans.
Customer Churn Prediction
ML pipeline built with pandas, scikit-learn, and XGBoost, covering feature engineering and explainability with SHAP.
Security Log Analysis & Encryption Practice
Regex log parsing, error tracking, encryption via Fernet, hash cracking demo.
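A minimal sketch of the regex parsing and hashing steps, using only the Python standard library. The log line format, the pattern, and the sample lines are assumptions for illustration; Fernet encryption is left out because it needs the third-party cryptography package.

```python
import hashlib
import re
from collections import Counter

# Assumed log format: "<date> <time> <LEVEL> <message>".
LOG_PATTERN = re.compile(r"^(?P<ts>\S+ \S+) (?P<level>[A-Z]+) (?P<msg>.*)$")


def count_levels(lines):
    """Count log lines per severity level, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            counts[match.group("level")] += 1
    return counts


def sha256_hex(text):
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


sample = [
    "2024-01-01 10:00:00 INFO service started",
    "2024-01-01 10:00:05 ERROR database timeout",
    "2024-01-01 10:00:09 ERROR retry failed",
    "malformed line",
]
counts = count_levels(sample)
print(counts["ERROR"], counts["INFO"])
print(sha256_hex("abc"))
```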
TF-IDF Based JavaScript Tracker Detection
TF-IDF feature extraction with SVC and One-Class SVM classifiers, achieving 99.7% AUC.
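The TF-IDF step can be sketched in plain Python as below. In practice a library vectorizer would normally be used; this version only shows the computation, with an assumed identifier-based tokenizer, the unsmoothed formula tf * log(N / df), and two made-up script snippets. The classifier stage is not shown.

```python
import math
import re
from collections import Counter


def tokenize(source):
    """Split JavaScript source into identifier-like tokens (an assumption)."""
    return re.findall(r"[A-Za-z_]\w*", source)


def tfidf(documents):
    """Return one {token: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each token once per document
    rows = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        rows.append({
            tok: (count / len(tokens)) * math.log(n_docs / doc_freq[tok])
            for tok, count in term_freq.items()
        })
    return rows


# Made-up snippets: the first reads a cookie, the second does not.
scripts = [
    "var x = document.cookie; send(x);",
    "var y = 1; render(y);",
]
rows = tfidf(scripts)
print(rows[0])
```

Tokens that appear in every script, such as var here, receive a weight of zero, while tokens specific to one script, such as cookie, receive a positive weight and therefore carry the signal a downstream classifier can use.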
Short Text Classification with BERT + One-Class SVM
Combined BERT embeddings with OC-SVM for hate speech classification on imbalanced datasets.
Awards
- 2018-2023, International Postgraduate Tuition Awards (IPTA) Scholarship from UOW, Australia.
- 2018-2022, University Postgraduate Award (UPA) Scholarship from UOW, Australia.
- 2012-2014, The Indian Council for Cultural Relations (ICCR) Fellowship from Government of India.
Professional Experience
Lecturer – Canterbury Institute of Management (CIM) (2025 – Present)
- Lead and co-ordinate units: MBIS401, MBIS402 and MBIS403.
- Redesigned curricula to embed AWS SageMaker labs and secure SDLC practices.
- Supervise capstone projects applying deep-learning models to real client data.
Founder & AI/ML Consultant – Analytics Edge Learning (2023 – Present)
- Designed and deployed sentiment‑analysis & churn‑prediction pipelines (PySpark, Scikit‑learn, AWS).
- Built NLP‑based document‑clustering solutions for e‑commerce clients.
- Advised on cloud architecture, MLOps, and data‑governance best practices.
- Community Outreach:
  - Mentor, “Code for Good” (Ryde Baptist Church) – guiding high‑school students in coding & digital literacy.
  - Facilitator, “Learn the Technology” – workshops helping seniors adopt modern digital tools.
Sessional & Part‑Time Academic (2019 – 2025)
- King’s Own Institute: ICT729, ICT712, ICT753.
- Torrens University: CLA321, SDM404, BDA601.
- UTS: 31282 – Systems Testing, 32516 – Internet Programming.
- University of Sydney: COMP5046 (NLP), INFO5992, INFO1110.
- Macquarie University: COMP3210/6210, COMP8320, COMP2100, COMP8325, WCOM1300.
- University of Wollongong: CSIT111, CSIT114/814, CSIT115, CSIT985, CSCI926/426, CSCI318, CSIT970.
Other Professional Activities
- Program Committee Member – ECAI 2024, PRICAI 2024.
- Reviewer – LREC, ADMA.
- Graduate Member – ACS | Member – IEEE, ELRA.