NL for SE @ AAAI 2018

Overview

The proliferation of open-source projects has produced large amounts of source code and related artifacts. Arguably, the rich and open resources associated with software (including open-source repositories, Q&A sites, change histories, and communications between developers) are the richest and most detailed information resource for any technical area. Recently it has been discovered that “natural”, human-produced software has many interesting statistical regularities. As a consequence, code corpora, just like natural language corpora, are amenable to statistical modeling, and a number of software tasks such as coding, testing, porting, and bug-patching can potentially be enhanced by the use of these statistical models.

This interdisciplinary workshop will explore issues related to the statistical modeling of software corpora, including topics such as: modeling repetitiveness in source code; use of language models for code suggestion in IDEs; using probabilistic grammars to mine programming idioms; statistical methods for type inference in dynamically typed languages; statistical machine translation for porting applications between programming languages, or “minifying” JavaScript; using statistical language models to find bugs; or statistical methods for automatic code patching, code summarization, code retrieval, code annotation, or test generation.

The workshop follows several earlier workshops on this topic at Microsoft Research, Dagstuhl, and SIGSOFT FSE.

We are delighted that the workshop will feature two invited speakers: Graham Neubig, of Carnegie-Mellon University, and Danny Tarlow, of Google Brain.

Funding

We gratefully acknowledge funding from NSF to support a limited number of US travellers to the workshop, especially students, members of under-represented groups, and researchers who might not normally attend AAAI.

Call for participation

We invite you to join us in New Orleans. We have a great schedule of two keynote presentations and a collection of presentations showcasing the latest work in this area.

Schedule

Program overview

Feb 2, 2018

8:30am – 8:45am

Welcome

8:45am – 10:30am

First Session

Chair: Prem Devanbu, UC Davis

- Keynote Talk, Danny Tarlow, Google Brain (Title: "Why Deep Nets are Probably the Best Choice for Modeling Software")

- Natural Language Processing and Program Analysis For Supporting Todo Comments As Software Evolves (Long)
- Using Natural Language Processing for Documentation Assist (Long)

10:30am – 11am
Coffee Break

11am – 12:30pm

Paper Presentation

Session 1

- NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System (Long)
- Generating Regular Expressions from Natural Language Specifications: Are We There Yet? (Long)
- Studying the Differences Between Natural and Programming Languages (Long)
- Automated refactoring of object-oriented code using clustering ensembles (Short)
- Improving the quality of Clone Detection with Conceptual Similarity of Source code (Short)
- Towards J.A.R.V.I.S. for Software Engineering: Lessons Learned Implementing a Natural Language Chat Interface (Short)

12:30pm – 2pm
Lunch, on your own

2:00pm – 3:30pm

Session 2

- Can we Learn Type Inference (Long)
- Statistical Machine Translation Is a Natural Fit for Automatic Identifier Renaming in Software Source Code (Long)
- Evaluation of Type Inference with Textual Cues (Long)
- Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks (Short)
- Extracting information types from Android layout code using sequence to sequence learning (Short)
- Towards Traceability Link Recovery for Self-Adaptive Systems (Short)

3:30pm – 4:00pm
Coffee Break

4:00pm – 6:00pm

- Keynote 2, Graham Neubig (Title: "Program Synthesis and Description with Structured Machine Learning Models")
- Closing Remarks/Discussion

Important dates

October 16, 2017

Workshop Submissions Due (Anywhere on Earth)

November 9, 2017

Notifications Sent to Authors

November 21, 2017

Final Workshop Papers Due at AAAI

Program Committee

Program Chairs

Prem Devanbu, University of California, Davis

William Cohen, Carnegie-Mellon University

Program Committee

Earl Barr, University College, London

Jacob Devlin, Google

Doug Downey, Northwestern University

Aditya Kanade, Indian Institute of Science

Ray Mooney, UT Austin

Graham Neubig, Carnegie-Mellon University

Tien Nguyen, UT Dallas

Dennis Poshyvanyk, William & Mary

Charles Sutton, University of Edinburgh

Bogdan Vasilescu, Carnegie-Mellon University

Martin Vechev, ETH, Zurich

Contact
For questions or comments about the workshop, please contact William Cohen or Prem Devanbu.
