Re-Structuring CLIP's Language Capabilities
Vision-language models (VLMs) like CLIP have transformed how we approach image classification. The performance of these models is heavily influenced by subtle choices such as pr...
While impressive examples of AI-generated art and dialogue have captured the public's attention in recent years, one of the most fundamental data formats, tabular data, still lack...
Large pretrained models like GPT-4, Gemini, and Claude 3 are fantastic at labeling data, whether it's spam detection in YouTube comments or classifying topics in medical documen...
Efficient LLM alignment without the data and compute expense of traditional methods.
Exploring how overlap density drives weak-to-strong generalization and its applications in data source selection.
OTTER offers a tuning-free, inference-time label distribution adaptation of zero-shot models by leveraging optimal transport.
Effortlessly robustify CLIP-based models to handle spurious correlations: no extra data, no extra training!