| Main authors: | Huang, Chen; Seto, Skyler; Abnar, Samira; Grangier, David; Jaitly, Navdeep; Susskind, Josh |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online access: | https://arxiv.org/abs/2410.23698 |
Similar items
Matryoshka Diffusion Models
by: Gu, Jiatao, et al.
Published: (2023)
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
by: Gu, Jiatao, et al.
Published: (2024)
Improving GFlowNets for Text-to-Image Diffusion Alignment
by: Zhang, Dinghuai, et al.
Published: (2024)
Normalizing Flows are Capable Generative Models
by: Zhai, Shuangfei, et al.
Published: (2024)
Proxy-FDA: Proxy-based Feature Distribution Alignment for Fine-tuning Vision Foundation Models without Forgetting
by: Huang, Chen, et al.
Published: (2025)
How PARTs assemble into wholes: Learning the relative composition of images
by: Ayoughi, Melika, et al.
Published: (2025)
Text-Conditional JEPA for Learning Semantically Rich Visual Representations
by: Huang, Chen, et al.
Published: (2026)
TypeScore: A Text Fidelity Metric for Text-to-Image Generative Models
by: Sampaio, Georgia Gabriela, et al.
Published: (2024)
HyperCLIP: Adapting Vision-Language models with Hypernetworks
by: Akinwande, Victor, et al.
Published: (2024)
Normalizing Trajectory Models
by: Gu, Jiatao, et al.
Published: (2026)
How Far Are We from Intelligent Visual Deductive Reasoning?
by: Zhang, Yizhe, et al.
Published: (2024)
STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows
by: Gu, Jiatao, et al.
Published: (2025)
Rethinking JEPA: Compute-Efficient Video SSL with Frozen Teachers
by: Li, Xianhang, et al.
Published: (2025)
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
by: Gu, Jiatao, et al.
Published: (2024)
Adapting to Distribution Shift by Visual Domain Prompt Generation
by: Chi, Zhixiang, et al.
Published: (2024)
Enhancing CLIP with CLIP: Exploring Pseudolabeling for Limited-Label Prompt Tuning
by: Menghini, Cristina, et al.
Published: (2023)
Better than Average: Spatially-Aware Aggregation of Segmentation Uncertainty Improves Downstream Performance
by: Guarino, Vanessa Emanuela, et al.
Published: (2026)
The Coupling Within: Flow Matching via Distilled Normalizing Flows
by: Berthelot, David, et al.
Published: (2026)
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
by: Grangier, David, et al.
Published: (2024)
Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation
by: Chi, Zhixiang, et al.
Published: (2025)
MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning
by: Nicolas, Julien, et al.
Published: (2023)
AREA: Attribute Extraction and Aggregation for CLIP-Based Class-Incremental Learning
by: Xie, Zhen-Hao, et al.
Published: (2026)
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
by: Maini, Pratyush, et al.
Published: (2024)
Understanding Model Reprogramming for CLIP via Decoupling Visual Prompts
by: Cai, Chengyi, et al.
Published: (2025)
Visual Modality Prompt for Adapting Vision-Language Object Detectors
by: Medeiros, Heitor R., et al.
Published: (2024)
Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization
by: Zang, Yuhang, et al.
Published: (2024)
STARFlow2: Bridging Language Models and Normalizing Flows for Unified Multimodal Generation
by: Shen, Ying, et al.
Published: (2026)
Learning to Rank Pre-trained Vision-Language Models for Downstream Tasks
by: Ding, Yuhe, et al.
Published: (2024)
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
by: Abnar, Samira, et al.
Published: (2025)
Federated Domain Generalization via Prompt Learning and Aggregation
by: Gong, Shuai, et al.
Published: (2024)
SimpleFold: Folding Proteins is Simpler than You Think
by: Wang, Yuyang, et al.
Published: (2025)
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
by: Ablin, Pierre, et al.
Published: (2025)
Learning Generalizable Prompt for CLIP with Class Similarity Knowledge
by: Jung, Sehun, et al.
Published: (2025)
Transitive Vision-Language Prompt Learning for Domain Generalization
by: Wang, Liyuan, et al.
Published: (2024)
MIP: CLIP-based Image Reconstruction from PEFT Gradients
by: Zhou, Peiheng, et al.
Published: (2024)
CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling
by: Wang, Xinze, et al.
Published: (2025)
Detecting AI-Generated Images via CLIP
by: Moskowitz, A. G., et al.
Published: (2024)
Stable Diffusion Dataset Generation for Downstream Classification Tasks
by: Lomurno, Eugenio, et al.
Published: (2024)
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
by: Gu, Jiatao, et al.
Published: (2025)
Breaking the Limits of Open-Weight CLIP: An Optimization Framework for Self-supervised Fine-tuning of CLIP
by: Mehta, Anant, et al.
Published: (2026)