The proliferation of open-source projects has produced large amounts of source code and related artifacts. Arguably, the rich and open resources associated with software (including open-source repositories, Q&A sites, change histories, and communications between developers) form the richest and most detailed information resource for any technical area. Recently it has been discovered that “natural”, human-produced software has many interesting statistical regularities. As a consequence, code corpora, just like natural language corpora, are amenable to statistical modeling, and a number of software tasks, such as coding, testing, porting, and bug patching, can potentially be enhanced by the use of these statistical models.

This interdisciplinary workshop will explore issues related to the statistical modeling of software corpora, including topics such as: modeling repetitiveness in source code; using language models for code suggestion in IDEs; using probabilistic grammars to mine programming idioms; statistical methods for type inference in dynamically typed languages; statistical machine translation for porting applications between programming languages, or for “mini-fying” JavaScript; using statistical language models to find bugs; and statistical methods for automatic code patching, code summarization, code retrieval, code annotation, or test generation.

The workshop follows several earlier workshops on this topic at Microsoft Research, Dagstuhl, and SIGSOFT FSE.

We are delighted that the workshop will feature two invited speakers: Graham Neubig, of Carnegie-Mellon University, and Danny Tarlow, of Google Brain.


We gratefully acknowledge funding from NSF to support a limited number of US travellers to the workshop, especially students, members of under-represented groups, and researchers who might not normally attend AAAI.

Call for participation

We invite you to join us in New Orleans. We have a great schedule, with two keynote presentations and a collection of presentations showcasing the latest work in this area.


Program overview

Feb 2, 2018

8:30am –  8:45am
8:45am – 10:30am

First Session

Chair: Prem Devanbu, UC Davis

- Keynote Talk, Danny Tarlow, Google Brain (Title: "Why Deep Nets are Probably the Best Choice for Modeling Software")

- Natural Language Processing and Program Analysis For Supporting Todo Comments As Software Evolves (Long)

- Using Natural Language Processing for Documentation Assist (Long)

10:30am – 11am
Coffee Break
11am – 12:30pm

Paper Presentation

Session 1

- NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System (Long)

- Generating Regular Expressions from Natural Language Specifications: Are We There Yet? (Long)

- Studying the Differences Between Natural and Programming Languages (Long)

- Automated refactoring of object-oriented code using clustering ensembles (Short)

- Improving the quality of Clone Detection with Conceptual Similarity of Source code (Short)

- Towards J.A.R.V.I.S. for Software Engineering: Lessons Learned Implementing a Natural Language Chat Interface (Short)

12:30pm – 2pm
Lunch, on your own
2:00pm – 3:30pm

Session 2

- Can we Learn Type Inference (Long)

- Statistical Machine Translation Is a Natural Fit for Automatic Identifier Renaming in Software Source Code (Long)

- Evaluation of Type Inference with Textual Cues (Long)

- Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks (Short)

- Extracting information types from Android layout code using sequence to sequence learning (Short)

- Towards Traceability Link Recovery for Self-Adaptive Systems (Short)

3:30pm – 4:00pm
Coffee Break
4:00pm – 6:00pm

Keynote 2, Graham Neubig (Title: "Program Synthesis and Description with Structured Machine Learning Models")

Closing Remarks/Discussion

Important dates

October 16, 2017
Workshop Submissions Due (AOE time)
November 9, 2017
Notifications Sent to Authors
November 21, 2017
Final Workshop Papers Due at AAAI

Program Committee

Program Chairs
Prem Devanbu, University of California, Davis
William Cohen, Carnegie-Mellon University
Program Committee
Earl Barr, University College, London
Jacob Devlin, Google
Doug Downey, Northwestern University
Aditya Kanade, Indian Institute of Science
Ray Mooney, UT Austin
Graham Neubig, Carnegie-Mellon University
Tien Nguyen, UT Dallas
Dennis Poshyvanyk, William & Mary
Charles Sutton, University of Edinburgh
Bogdan Vasilescu, Carnegie-Mellon University
Martin Vechev, ETH, Zurich