ICLR 2024

A leading conference on representation learning. Single submission round, transparent review cycle.

OpenReview

7404

Visible submissions

Total shown on OpenReview

39.1%

Acceptance rate

3889

Critiques analyzed

Drawn from 300 rejected submissions

Patterns detected

Score distribution

Scores that reviewers assigned in their reviews.

Committee decisions

How the final decisions break down across the visible submissions.

Reject · 60.9%
Accept (poster) · 31.3%
Accept (spotlight) · 6.3%
Accept (oral) · 1.5%

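Weakness map

Each bar is a recurring pattern in the reviews of rejected papers. Its width represents the pattern's weight among the critiques analyzed. The percentages on the bars match each count divided by the 3,403 critiques that fall under the ten patterns, which is why they run higher than the per-pattern shares of all 3,889 critiques listed further down.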

Unclear definitions and algorithms · 25.3% · 860
Insufficient datasets and training setup · 19.7% · 672
No clear contribution · 12.1% · 411
Comparison against baselines · 9.1% · 309
Positioning against existing work · 7.3% · 248
Shallow experimental results · 7.4% · 253
Poorly structured related work section · 6.5% · 221
Errors in sections and internal references · 6.3% · 215
Confusing figures and captions · 3.8% · 130
Real-world applicability questioned · 2.5% · 84
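
The two percentage series on this page can be reproduced from the counts alone. The sketch below assumes that the bar percentages use the sum of the ten pattern counts as denominator and that the per-pattern percentages use the 3889 critiques analyzed; that reading is inferred from the arithmetic, not stated by the page. The English pattern names and variable names are illustrative.

```python
# Counts per pattern, copied from the weakness map.
counts = {
    "Unclear definitions and algorithms": 860,
    "Insufficient datasets and training setup": 672,
    "No clear contribution": 411,
    "Comparison against baselines": 309,
    "Positioning against existing work": 248,
    "Shallow experimental results": 253,
    "Poorly structured related work section": 221,
    "Errors in sections and internal references": 215,
    "Confusing figures and captions": 130,
    "Real-world applicability questioned": 84,
}

TOTAL_CRITIQUES = 3889            # "critiques analyzed" on the page
clustered = sum(counts.values())  # critiques that fall under one of the ten patterns


def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Assumed denominators: clustered critiques for the bars, all critiques for the detail cards.
bar_share = {name: share(n, clustered) for name, n in counts.items()}
total_share = {name: share(n, TOTAL_CRITIQUES) for name, n in counts.items()}

for name in counts:
    print(f"{name}: {bar_share[name]}% of clustered, {total_share[name]}% of all critiques")
```

Under this reading, 486 of the 3889 critiques do not belong to any of the ten patterns, which accounts for the gap between the two series.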

Patterns, one by one

Ordered by weight. For each pattern we show what it represents, how reviewers phrase it, and a practical takeaway you can apply before submitting your next paper.

#01

Unclear definitions and algorithms

22.1% of total · 860 items

Reviewers cannot reconstruct what the paper proposes from its formal description: incomplete notation, unstated assumptions, equations that do not match the text.

authors · equation · theorem · algorithm · eq · clear · defined · Section

How reviewers put it

One of the biggest weaknesses of the paper is that it does not properly place its results in the context of the existing literature. Once the indicator function w.r.t. the threshold has been defined the proof relies on existing techniques (...).

It is unclear from the description in Section 3 whether the network outputs are normalised to a probability simplex or treated as logits. The same symbol denotes both objects in different equations.

The proof of Theorem 2 invokes Lemma 1 with a slightly different set of assumptions; the authors should explicitly state which version they are using.

Practical takeaway. Before submitting, ask someone outside the project to read only Section 3 (the definitions) and reproduce the algorithm. If they cannot, rewrite it.

#02

Insufficient datasets and training setup

17.3% of total · 672 items

Modeling and training choices are poorly justified: the evaluation covers only one or two datasets, hyperparameters are missing, and the choice of base model is not discussed.

model · data · models · dataset · training · performance · different · datasets

How reviewers put it

Despite the promising idea, the authors do not convincingly and clearly show the value of the proposed metrics and the goal of the uncertainty analysis. First, the meaning of the metrics is not explained and must be found in the literature.

All experiments use the same backbone with default hyperparameters. The paper would be much stronger with a sensitivity study showing the method is not brittle to specific settings.

Only two datasets are evaluated; both are saturated benchmarks. A third, harder dataset would strengthen the case considerably.

Practical takeaway. Evaluate on at least three diverse datasets. Report the full hyperparameters in an appendix. Explicitly justify the base model you chose.
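
One way to keep the appendix in sync with the experiments is to generate the hyperparameter table from the same configuration object the training script uses. The sketch below is a minimal illustration: the `config` values are invented, the function names are hypothetical, and the output assumes the LaTeX `booktabs` package is loaded.

```python
def latex_escape(text):
    """Escape the two characters most likely to break a table cell."""
    return text.replace("_", r"\_").replace("%", r"\%")


def hyperparameter_table(config):
    """Render a dict of hyperparameters as a two-column LaTeX tabular."""
    lines = [
        r"\begin{tabular}{ll}",
        r"\toprule",  # requires \usepackage{booktabs}
        r"Hyperparameter & Value \\",
        r"\midrule",
    ]
    for name, value in config.items():
        lines.append(latex_escape(str(name)) + " & " + latex_escape(str(value)) + r" \\")
    lines += [r"\bottomrule", r"\end{tabular}"]
    return "\n".join(lines)


# Hypothetical configuration, for illustration only.
config = {
    "learning_rate": 3e-4,
    "batch_size": 128,
    "weight_decay": 0.01,
    "seeds": "0, 1, 2, 3, 4",
}

table = hyperparameter_table(config)
print(table)
```

Because the table is produced from the dictionary that drives training, a changed setting cannot silently diverge from what the appendix reports.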

#03

No clear contribution

10.6% of total · 411 items

The paper is interesting, but the reviewer cannot articulate which problem it solves that was not already solved, or which assumption it challenges.

paper · main · contribution · lacks · does · authors · provide · interesting

How reviewers put it

Despite the above strengths, this paper also has some drawbacks: I'm not sure I see a clear, single-sentence answer to what the main contribution is over the work cited as [4].

The paper has many interesting ideas, but the central claim is hidden across multiple sections. Pulling it into a single named contribution at the end of the introduction would help.

It is not easy to extract from the discussion which assumption of the prior work the authors are challenging.

Practical takeaway. Apply the one-sentence test: can you describe your contribution in a single sentence without using the word `new`? If not, it is not clear yet.

#04

Comparison against baselines

8% of total · 309 items

The baselines are few, or they are not the strongest ones. A comparison with the most relevant family of methods for the problem is missing.

method · proposed · proposed method · paper · baselines · performance · authors · does

How reviewers put it

It is not easy to see in principle how the proposed method is superior to the probabilistic method.

The authors compare against three relatively old baselines. Several stronger 2023 methods are missing entirely.

Why was the comparison restricted to the single-modal setting? The natural baseline in this sub-area is multi-modal.

Practical takeaway. List the three strongest baselines in your sub-area before drafting the results table. If your method does not beat them, say so explicitly and argue why your approach is better along another dimension.

#05

Positioning against existing work

6.4% of total · 248 items

The paper presents itself as a `unified framework` or `general approach`, but reviewers see that it covers only one specific sub-area of the literature. The related work section is incomplete.

methods · existing · comparison · existing methods · method · baselines · authors · paper

How reviewers put it

Though the authors claim that they aim to propose a unified framework, the methods considered in their paper are mainly based on AM and POMO, in other words, the auto-regressive methods. As far as I know, there are also other methods.

The paper claims to apply broadly but only validates on one task family. Either evaluate on a second family or restrict the framing.

Several recent works on this exact problem are not cited; the contribution would look different in their light.

Practical takeaway. If you claim generality, demonstrate it: include at least one experiment outside the original niche. Otherwise, tone down the language.

#06

Shallow experimental results

6.5% of total · 253 items

The tables are there but the analysis is not: no error bars, no significance tests, no visualizations that tell the story.

results · experimental · experimental results · paper · authors · experiments · empirical · table

How reviewers put it

Visualization results are encouraged to be included.

Table 2 reports point estimates only. Without error bars it is impossible to tell whether the differences are significant.

The empirical evaluation is described in two paragraphs but the underlying methodology — number of seeds, splits, evaluation protocol — is not clear.

Practical takeaway. Pair each table with at least one figure that illustrates the key difference. Report confidence intervals, not just means.
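
As an illustration of reporting an interval instead of a bare mean, the sketch below computes a percentile bootstrap confidence interval over per-seed scores using only the Python standard library. The scores are made up, the function name is hypothetical, and the percentile bootstrap is one reasonable choice among several; with very few seeds any interval is rough.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))  # resample with replacement
        for _ in range(n_resamples)
    )
    low = means[int((alpha / 2) * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Hypothetical accuracies from five training seeds, for illustration only.
seed_scores = [0.812, 0.798, 0.805, 0.821, 0.809]

mean = statistics.fmean(seed_scores)
low, high = bootstrap_ci(seed_scores)
print(f"accuracy {mean:.3f} (95% CI {low:.3f} to {high:.3f}, n={len(seed_scores)} seeds)")
```

Stating the number of seeds next to the interval also answers the protocol question reviewers raise in the quotes above.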

#07

Poorly structured related work section

5.7% of total · 221 items

Section 2 lists prior work but neither compares the works with one another nor connects them to the paper's contribution. It is a mere `wall of citations`.

work · related · related work · works · related works · novelty · authors · previous

How reviewers put it

The related work section needs reorganization. The current version simply decomposes the framework into related parts and introduces the works one by one. It's hard to grasp the major contribution.

Many cited works are summarised but not contrasted. Add a one-sentence comparison per cluster of related work.

The novelty argument depends on a comparison the section does not make explicit.

Practical takeaway. Organize related work by dimension, not by paper. Each subsection should end with a sentence of the form `leaving a gap that we fill with X`.

#08

Errors in sections and internal references

5.5% of total · 215 items

Small errors that break the reading flow: equations with swapped indices, internal references that point to the wrong section, undefined abbreviations.

section · authors · used · paragraph · paper · section authors · clarity · experiments

How reviewers put it

Section 4.2. r^{i}|wrap(I^{i-1}, m^{i->i-1})-I^{i}| should be r^{i}|wrap(I^{i-1}, m^{i-1>i})-I^{i}| ?

many typos, e.g. section 5, `We also e introduce`, what is the e for?

Eq. (8) references Lemma 3 but the lemma is numbered 4 in the appendix.

Practical takeaway. A proofreading pass dedicated to `consistency` (not `grammar`) catches most of these. Do it in the days before the deadline, not in the middle of writing.
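
Part of that consistency pass can be automated. The sketch below assumes the paper is written in LaTeX and flags reference keys that have no matching `\label`. It covers only this one class of error (hard-coded numbers such as a typed "Lemma 3" still need a manual read), and the function name and sample text are hypothetical.

```python
import re

LABEL_RE = re.compile(r"\\label\{([^}]+)\}")
# Common reference commands; extend the list if the paper uses others.
REF_RE = re.compile(r"\\(?:eqref|ref|autoref|cref|Cref)\{([^}]+)\}")


def undefined_refs(tex):
    """Return the sorted reference keys in `tex` that no \\label defines."""
    labels = set(LABEL_RE.findall(tex))
    refs = set()
    for group in REF_RE.findall(tex):
        # \cref accepts comma-separated keys, so split each group.
        refs.update(key.strip() for key in group.split(","))
    return sorted(refs - labels)


# Made-up snippet with one misspelled key (lem:bnd instead of lem:bound).
sample = r"""
\begin{lemma}\label{lem:bound} A bound holds. \end{lemma}
Eq.~\eqref{eq:loss} follows from Lemma~\ref{lem:bnd}.
\begin{equation}\label{eq:loss} L = 0 \end{equation}
"""

print(undefined_refs(sample))
```

Run over the concatenated source files, a non-empty result is a cheap signal that a cross-reference is broken before a reviewer finds it.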

#09

Confusing figures and captions

3.3% of total · 130 items

Too many figures, inadequate captions, missing units, legends unreadable at print size. Following the visual argument takes more effort than it should.

figure · figures · caption · results · figure figure · minor · text · unclear

How reviewers put it

I recommend the author combine Figure 3 and Figure 4 into one line such that a lot of space can be saved.

The legend in Fig. 5 is unreadable at the printed size; use thicker lines and contrasting markers.

Captions assume the reader has the body text in front of them. Make them self-contained.

Practical takeaway. Every figure should be readable in isolation, with a self-contained caption. If Fig. 3 and Fig. 4 say the same thing, merge them.

#10

Real-world applicability questioned

2.2% of total · 84 items

The method is evaluated in clean synthetic scenarios. Reviewers ask how it behaves when the data are noisy, when there is distribution shift, or when the method is part of a larger system beyond the pipeline.

real · real world · world · applications · scenarios · world applications · world scenarios · data

How reviewers put it

If the current setting is realistic, I suggest showing the effectiveness of attacking the real-world system.

Synthetic results are convincing; a single real-world deployment would change the contribution from theoretical to practical.

How does the method degrade under realistic noise levels? Reporting only the clean setting limits the impact.

Practical takeaway. If your method is viable in production, show it with an experiment on messy data. If it is not, state the scope honestly: basic research.
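
A degradation curve is one simple way to answer the noise question. The sketch below adds Gaussian noise of increasing standard deviation to the inputs and records accuracy at each level. Everything in it is a stand-in: `toy_predict` is a hypothetical placeholder for a real model, the data are invented, and Gaussian input noise is only one of many possible corruption models.

```python
import random


def accuracy_under_noise(predict, inputs, labels, sigmas, seed=0):
    """Accuracy of `predict` when Gaussian noise of each sigma is added to the inputs."""
    rng = random.Random(seed)  # fixed seed so the curve is reproducible
    curve = {}
    for sigma in sigmas:
        correct = 0
        for features, label in zip(inputs, labels):
            noisy = [value + rng.gauss(0.0, sigma) for value in features]
            correct += int(predict(noisy) == label)
        curve[sigma] = correct / len(labels)
    return curve


def toy_predict(features):
    """Placeholder model: predicts class 1 when the features sum to a positive number."""
    return int(sum(features) > 0)


# Invented, cleanly separable data, for illustration only.
inputs = [[1.0, 1.0], [-1.0, -1.0], [2.0, 0.5], [-0.5, -2.0]]
labels = [1, 0, 1, 0]

curve = accuracy_under_noise(toy_predict, inputs, labels, sigmas=[0.0, 0.5, 2.0])
for sigma, accuracy in curve.items():
    print(f"sigma={sigma}: accuracy={accuracy:.2f}")
```

Reporting the whole curve, rather than the clean setting alone, shows where the method starts to break down.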
