============================================================================
ACL 2017 Reviews for Submission #440
============================================================================

Title: A* CCG Parsing with a Supertag and Dependency Factored Model

Authors: Masashi Yoshikawa, Hiroshi Noji and Yuji Matsumoto

============================================================================
REVIEWER #1
============================================================================

---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------

APPROPRIATENESS: 5
CLARITY: 4
ORIGINALITY: 4
EMPIRICAL SOUNDNESS / CORRECTNESS: 4
THEORETICAL SOUNDNESS / CORRECTNESS: 4
MEANINGFUL COMPARISON: 4
SUBSTANCE: 4
IMPACT OF IDEAS OR RESULTS: 4
IMPACT OF ACCOMPANYING SOFTWARE: 3
IMPACT OF ACCOMPANYING DATASET: 1
RECOMMENDATION: 4
REVIEW DATASET: Yes

---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------

This paper describes a state-of-the-art CCG parsing model that decomposes into tagging and dependency scores and has an efficient A* decoding algorithm. Interestingly, the paper slightly outperforms Lee et al. (2016)'s more expressive global parsing model, presumably because this factorization makes learning easier. It's great that they also report results on another language, showing large improvements over existing work on Japanese CCG parsing. One surprising original result is that modeling the first word of a constituent as the head substantially outperforms linguistically motivated head rules.

Overall this is a good paper that makes a nice contribution. I only have a few suggestions:

- I liked the way that the dependency and supertagging models interact, but it would be good to include baseline results for simpler variations (e.g. not conditioning the tag on the head dependency).
- The paper achieves new state-of-the-art results on Japanese by a large margin. However, there has been a lot less work on this data - would it also be possible to train the Lee et al. parser on this data for comparison?
- Lewis, He and Zettlemoyer (2015) explore combined dependency and supertagging models for CCG and SRL, and may be worth citing.

============================================================================
REVIEWER #2
============================================================================

---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------

APPROPRIATENESS: 5
CLARITY: 4
ORIGINALITY: 3
EMPIRICAL SOUNDNESS / CORRECTNESS: 4
THEORETICAL SOUNDNESS / CORRECTNESS: 5
MEANINGFUL COMPARISON: 5
SUBSTANCE: 4
IMPACT OF IDEAS OR RESULTS: 4
IMPACT OF ACCOMPANYING SOFTWARE: 4
IMPACT OF ACCOMPANYING DATASET: 1
RECOMMENDATION: 4
REVIEW DATASET: Yes

---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------

- Strengths: This paper presents an extension to A* CCG parsing to include dependency information. Achieving this while maintaining speed and tractability is a very impressive feature of this approach. The ability to precompute attachments is a nice trick.
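To make this concrete, here is a minimal sketch of the factored score and the precomputed A* estimate (illustrative only, not the authors' code; the tag_logp/dep_logp arrays and the span convention are my own assumptions):

```python
import numpy as np

# Sketch of a supertag- and dependency-factored parse score (assumed array
# names, not the authors' implementation). tag_logp[i, c] is the
# log-probability of supertag c for word i; dep_logp[i, h] is the
# log-probability that word i's head is word h.

def parse_log_prob(tags, heads, tag_logp, dep_logp):
    """Score of a full parse: per-word supertag terms plus per-word head
    terms, with no interaction across words."""
    return sum(tag_logp[i, c] for i, c in enumerate(tags)) + \
           sum(dep_logp[i, h] for i, h in enumerate(heads))

def make_outside_heuristic(tag_logp, dep_logp):
    """Precompute per-word maxima once per sentence; the estimate for a
    chart item covering words [i, j) is the best score the words outside
    the span could possibly achieve, so it never underestimates the true
    outside score (an admissible heuristic for A* search)."""
    best = tag_logp.max(axis=1) + dep_logp.max(axis=1)
    prefix = np.concatenate([[0.0], np.cumsum(best)])
    total = prefix[-1]
    return lambda i, j: total - (prefix[j] - prefix[i])
```

Because the per-word maxima depend only on the tagger outputs, the whole estimate is fixed before parsing starts, which is presumably what keeps the search fast.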
I also really appreciated the evaluation of the effect of the head rules on normal-form violations and would love to see more details on the remaining cases.

- Weaknesses: I'd like to see more analysis of certain dependency structures. I'm particularly interested in how coordination and relative clauses are handled when the predicate-argument structure of CCG is at odds with the dependency structures normally used by other dependency parsers.

- General Discussion: I'm very happy with this work and feel it's a very nice contribution to the literature. The only thing missing for me is a more in-depth analysis of the types of constructions which saw the most improvement (English and Japanese) and a discussion (mentioned above) reconciling the predicate-argument dependencies with those of other parsers.

============================================================================
REVIEWER #3
============================================================================

---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------

APPROPRIATENESS: 5
CLARITY: 4
ORIGINALITY: 2
EMPIRICAL SOUNDNESS / CORRECTNESS: 3
THEORETICAL SOUNDNESS / CORRECTNESS: 4
MEANINGFUL COMPARISON: 5
SUBSTANCE: 4
IMPACT OF IDEAS OR RESULTS: 3
IMPACT OF ACCOMPANYING SOFTWARE: 1
IMPACT OF ACCOMPANYING DATASET: 1
RECOMMENDATION: 4
REVIEW DATASET: No

---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------

This paper proposes a CCG parsing model that brings together recent work on supertagging-based CCG parsing and dependency parsing. Combining both approaches in a single model achieves a modest gain over the existing state of the art for English. Much larger gains are observed for Japanese with a simple adaptation for converting CCG to dependency trees. While the model is a straightforward adaptation of previous work, achieving good performance with a simplified dependency conversion rule and the exclusion of normal-form constraints is a surprising and interesting result. Generalizing recent parsing improvements to other languages is also an important contribution.

There are a number of missing experiments that would verify the modeling contribution of this paper, which is the multitask dependency/supertagging setup and the use of the most probable head to predict the supertag. These contributions warrant additional ablations and comparisons (a sketch contrasting (1) with the paper's head-conditioned setup follows these comments):

(1) What if a simple MLP over the hidden vector is used to predict the supertag instead?
(2) What if the gold head is used at training time instead of the most probable head?
(3) What if the two tasks were trained separately? Does the parameter sharing matter?
(4) What if the architecture is reversed, and the most probable tag is used to predict the head?

Another natural question that is perhaps out of scope is the impact of modeling tree-structured dependencies rather than the graph-structured CCG dependencies proposed in Lewis et al. (2016).
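To make (1) concrete relative to the paper's setup, here is a rough sketch (purely illustrative; the use of PyTorch, the layer sizes, and the concatenation of head vectors are my assumptions, not the authors' code) contrasting a plain per-word MLP supertagger with one that also conditions on the most probable head:

```python
import torch
import torch.nn as nn

# Illustrative comparison only; module names and dimensions are assumed.

class PlainSupertagger(nn.Module):
    """Ablation (1): predict each word's supertag from its hidden vector alone."""
    def __init__(self, hidden_dim, n_tags):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_tags))

    def forward(self, h):                      # h: (n_words, hidden_dim)
        return self.mlp(h)                     # (n_words, n_tags) tag scores


class HeadConditionedSupertagger(nn.Module):
    """Head-conditioned variant: concatenate each word's vector with the
    vector of its most probable head before predicting the supertag (the
    concatenation + MLP choice is a guess at how the conditioning is done)."""
    def __init__(self, hidden_dim, n_tags):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_tags))

    def forward(self, h, head_scores):         # head_scores: (n_words, n_words)
        best_head = head_scores.argmax(dim=1)  # most probable head per word
        return self.mlp(torch.cat([h, h[best_head]], dim=1))
```

Comparing these two directly would isolate how much the head conditioning contributes beyond the shared encoder features.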