Welcome! I am Tony, a final-year student in Systems and Computing Engineering and Electronics Engineering at Universidad de los Andes and a researcher at the Center of Research and Formation in Artificial Intelligence (CinfonIA). I am currently a research intern at Cornell University, and I recently started collaborating on research with a professor from the University of Illinois Chicago (UIC). My experience centers on NLP and GenAI, with a strong focus on developing reasoning capabilities for multimodal agents (image/video) so they can address complex real-world tasks and understand the semantics inherent in them. My ongoing research and recent publications reflect this focus.
📄 Publications
Semantic Shift Detection for 19th Century Spanish
Tony Montes, Laura Manrique-Gómez, Rubén Manrique
- Implemented a pipeline for detection of semantic shifts (changes in meaning) between two datasets in the same language, employing contextual word embeddings with BERT (fine-tuned on the historical dataset) and KMeans clustering.
- Evaluated the solution on the LSCDiscovery Binary Change Detection task across different BERT-like models and clustering algorithms, and checked its performance on 250 historical words.
- Compiled the largest 19th-century Spanish dataset, with ~180M tokens drawn from three sources: Project Gutenberg, The British Library Books, and LatamXIX.
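
The clustering idea behind the pipeline above can be illustrated with a minimal sketch. It is not the paper's implementation: the 2-D toy vectors stand in for BERT contextual embeddings, the small `kmeans_labels` helper stands in for a library KMeans, and scoring the shift with the Jensen-Shannon divergence between per-corpus cluster distributions is an assumption made for illustration. All function names are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Farthest-point initialisation keeps this toy example deterministic.
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each usage goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return (kl(p, m) + kl(q, m)) / 2


def shift_score(old_vecs, new_vecs, k=2):
    """Cluster the pooled usages of one word, then compare how each corpus
    distributes its usages over the clusters (assumed scoring rule)."""
    labels = kmeans_labels(old_vecs + new_vecs, k)
    old_labels, new_labels = labels[:len(old_vecs)], labels[len(old_vecs):]
    p = [old_labels.count(c) / len(old_labels) for c in range(k)]
    q = [new_labels.count(c) / len(new_labels) for c in range(k)]
    return jensen_shannon(p, q)


# Toy usage vectors for one word whose usages move to a different region.
old_usages = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [0.1, 0.1]]
new_usages = [[5.0, 5.1], [5.1, 5.0], [5.0, 5.0], [5.1, 5.1]]
print(shift_score(old_usages, new_usages))
```

A score near 1 means the two corpora use the word in disjoint clusters, while a score near 0 means the senses are distributed alike. Turning that score into a binary change decision would need a threshold, which this sketch does not choose.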
19th Century Latin American Spanish Newspaper Corpus with LLM OCR Correction
Laura Manrique-Gómez, Tony Montes, Arturo Rodríguez-Herrera, Rubén Manrique
- Developed a semi-automated methodology for correction of OCR errors and detection of surface forms in historical datasets, employing LLMs and a diff algorithm.
- Applied the methodology to LatamXIX, a 19th-century Latin American Spanish newspaper corpus developed by Laura Manrique-Gómez, achieving the expected results, as verified by experts.
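
As a rough illustration of how a diff can surface the changes an LLM makes to OCR text, here is a minimal sketch using Python's standard `difflib`. The publication does not say which diff algorithm it uses, so `difflib.SequenceMatcher` is a stand-in. The `token_changes` helper and the example sentences are hypothetical, and the "corrected" string is hard-coded rather than produced by a model.

```python
import difflib


def token_changes(ocr_text, corrected_text):
    """Return (operation, original tokens, corrected tokens) for every
    span where the corrected text differs from the OCR text."""
    original = ocr_text.split()
    corrected = corrected_text.split()
    matcher = difflib.SequenceMatcher(a=original, b=corrected, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replace / delete / insert spans
            changes.append((tag, " ".join(original[i1:i2]), " ".join(corrected[j1:j2])))
    return changes


# Hypothetical OCR output and a hypothetical corrected version of it.
ocr_line = "La poblacion de la ciudad cs numerosa"
fixed_line = "La población de la ciudad es numerosa"

for change in token_changes(ocr_line, fixed_line):
    print(change)
```

Each reported pair is only a candidate. In a semi-automated workflow a reviewer would still decide whether a given change fixes a genuine OCR error or alters a legitimate historical surface form that should be preserved.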
📖 Education
- 2020.01 - 2024.12, Universidad de los Andes, BS, Systems and Computing Engineering
- 2020.08 - 2024.12, Universidad de los Andes, BS, Electronics Engineering
🏢 Research Experience
- Research Intern, Cornell University
- Researcher at the ECE department, advised by Prof. Zhiru Zhang, employing language models for 3D-asset and image compression and controlled diffusion models for decompression with semantics and Canny edge maps
- External Researcher, University of Illinois Chicago
- Researcher advised by Prof. Moontae Lee, working on an agent-based Web-RAG solution for code assistance in completion, fixing, and solving tasks
- Research Assistant, Universidad de los Andes
- Researcher at the Center of Research and Formation in Artificial Intelligence (CinfonIA) in the Historical Ink project, led by Prof. Rubén Manrique
- BS Student, Universidad de los Andes
- Systems and Computing Engineering thesis: “Semantic Shift Detection for 19th Century Spanish”, advised by Prof. Rubén Manrique
- Electronics Engineering thesis: “Zero-Shot Video Question Answering via Agent with Open-Vocabulary Grounding Validation”, advised by Prof. Fernando Lozano