Curriculum Vitae

Basic Info

Chucky Ellison
New York, NY
cme@freefour.com

Experience as Software Engineer

  • OkCupid (October 2012–March 2015)
    • Developed algorithm and implemented system returning correlated interests to a user's interests.
    • Developed pluggable system to easily evaluate AB testing for website.
    • Designed and implemented new search system for making recommendations based on users' previous actions.
    • Developed techniques to analyze and evaluate site health.
  • Intel Corporation (Summers 2003, 2004, 2006, and 2007)
    • Developed tools to optimize loading data and to discover relationships between data in massive databases of manufacturing data.
    • Received highest reviews last 3 years.

Teaching

  • Mentored two CS undergrads at University of Illinois (2010–2012)
  • TA/Studio Lead for Programming Studio (Spring 2006–Fall 2007; Fall 2009)
    • Taught programming hygiene, structure, and engineering tradeoffs in small sessions.
    • Recognized on the "Incomplete List of Teachers Ranked as Excellent" for Spring 2006 and Fall 2006, and the renamed "List of Teachers Ranked as Excellent" for Fall 2007. Additionally received departmental teaching award for Fall 2009.
  • TA for Software Engineering (Fall 2005)
  • Counselor/TA at Governor's School West (Summer 2002)

Education

  • University of Illinois
  • North Carolina State University (December 2005)
    • BS in Computer Science
    • BS in Mathematics (Focus in Abstract Algebra)
    • Minor in Cognitive Science

Projects

Major

The Better Meta
An aggregate stats page for Hi-Rez's free-to-play, team-based, first-person shooter Paladins. Has interactive, auto-updating graphs informing players of top picks and strategies.
CFGLib
A C# library for (Probabilistic) Context Free Grammars (PCFGs). Supports probability-preserving conversion to Chomsky Normal Form (CNF), as well as the CYK and Earley parsing algorithms. These algorithms allow you to tell whether or not a grammar could have generated a given string, and if so, at what probability they are generated. In addition, the Earley algorithm supports returning a parse forest (containing all possible parse trees) in Shared Packed Parse Forest (SPPF) format.
C-Semantics
This project represents the ongoing effort to fully specify the semantics of the C programming language using the semantic definitional framework K. My contributions were described in my dissertation, a paper on the positive semantics, and a technical report/paper on the negative semantics. For this project, I used continuous integration and a huge number of integration and regression tests to make sure the semantics stayed correct as it became more complete.
LLVM-Semantics
This project is a collaboration between me and David Lazar to give an executable semantics to the LLVM assembly language (LLVM IR) in the K Semantic Framework. A significant subset of the language was defined, but the work was not completed.
Ricochet
Ricochet RPC is an RPC framework for C# focusing on speed and robustness.

Minor

Bootstrap
Simple bootstrapped two-sample t-test in C#.
Esolang Semantic
Formal semantics for some simple esoteric programming languages.
Getrude
Gertrude is a simple term rewriting language, written in Go.
sobriquet
A very simple name generator for C# based on Markov chains.
Stopwatch
A small library for Rust (my first foray into learning the language) used for timing things.
WarpWallet
This is an implementation of WarpWallet (a brain wallet generator for Bitcoin) in Go.

Publications

Theses

  • A Formal Semantics of C with Applications
    Chucky Ellison
    PhD Dissertation, University of Illinois, July 2012.
    URI PDF Project Slides

    This dissertation shows that complex, real programming languages can be completely formalized in the K Framework, yielding interpreters and analysis tools for testing and bug detection. This is demonstrated by providing, in K, the first complete formal semantics of the C programming language. With varying degrees of effort, tools such as interpreters, debuggers, and model-checkers, together with tools that check for memory safety, races, deadlocks, and undefined behavior are then generated from the semantics.

    Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 99.2% of 776 test programs. The semantics is also evaluated against popular analysis tools, using a new test suite in addition to a third-party test suite. The semantics-based tool performs at least as well or better than the other tools tested.

  • A Rewriting Logic Approach to Defining Type Systems
    Chucky Ellison
    Masters Thesis, University of Illinois, 2008.
    URI PDF

    We show how programming language semantics and definitions of their corresponding type systems can both be written in a single framework amenable to proofs of soundness. The framework is based on full rewriting logic (not to be confused with context reduction or term rewriting), where rules can match anywhere in a term (or configuration).

    We present an extension of the syntactic approach to proving type system soundness presented by Wright and Felleisen [1994] that works in the above described semantics-based domain. As before, the properties of preservation and progress are crucial. We use an abstraction function to relate semantic configurations in the language domain to semantic configurations in the type domain, and then proceed to use the preservation and progress properties as usual. We also develop an abstract type system, which is a type system modulo certain structural characteristics.

    To demonstrate the method, we give examples of five languages and corresponding type systems. They include two imperative languages and three functional languages, and three type checkers and two type inferencers. We then proceed to prove that preservation holds for each.

Conference Papers

  • Defining the Undefinedness of C
    Chris Hathhorn, Chucky Ellison, and Grigore Roșu
    Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'15), pp. 336–345. ACM, June 2015.
    DOI PDF Slides

    We present a "negative" semantics of the C11 language—a semantics that does not just give meaning to correct programs, but also rejects undefined programs. We investigate undefined behavior in C and discuss the techniques and special considerations needed for formally specifying it. We have used these techniques to modify and extend a semantics of C into one that captures undefined behavior. The amount of semantic infrastructure and effort required to achieve this was unexpectedly high, in the end nearly doubling the size of the original semantics. From our semantics, we have automatically extracted an undefinedness checker, which we evaluate against other popular analysis tools, using our own test suite in addition to a third-party test suite. Our checker is capable of detecting examples of all 77 categories of core language undefinedness appearing in the C11 standard, more than any other tool we considered. Based on this evaluation, we argue that our work is the most comprehensive and complete semantic treatment of undefined behavior in C, and thus of the C language itself.

  • Executing Formal Semantics with the K Tool
    David Lazar, Andrei Arusoaie, Traian Florin Șerbănuță, Chucky Ellison, Radu Mereuta, Dorel Lucanu, and Grigore Roșu
    Formal Methods (FM'12), Lecture Notes in Computer Science 7436, pp. 267–271. Springer, 2012.
    DOI PDF Slides

    This paper describes the K tool, a system for formally defining programming languages. Formal definitions created using the K tool automatically yield an interpreter for the language, as well as program analysis tools such as a state-space explorer. The modularity of K and the design of the tool allow one semantics to be used for several applications.

  • Test-case Reduction for C Compiler Bugs
    John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang
    Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'12), pp. 335–346. ACM, 2012.
    DOI PDF Project Slides

    To report a compiler bug, one must often find a small test case that triggers the bug. The existing approach to automated test-case reduction, delta debugging, works by removing substrings of the original input; the result is a concatenation of substrings that delta cannot remove. We have found this approach less than ideal for reducing C programs because it typically yields test cases that are too large or even invalid (relying on undefined behavior). To obtain small and valid test cases consistently, we designed and implemented three new, domain-specific test-case reducers. The best of these is based on a novel framework in which a generic fixpoint computation invokes modular transformations that perform reduction operations. This reducer produces outputs that are, on average, more than 25 times smaller than those produced by our other reducers or by the existing reducer that is most commonly used by compiler developers. We conclude that effective program reduction requires more than straightforward delta debugging.

  • An Executable Formal Semantics of C with Applications
    Chucky Ellison and Grigore Roșu
    Proceedings of the 39th Symposium on Principles of Programming Languages (POPL'12), pp. 533–544. ACM, 2012.
    DOI PDF Project Slides

    This paper describes an executable formal semantics of C. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 99.2% of 776 test programs. It is the most complete and thoroughly tested formal definition of C to date. The semantics yields an interpreter, debugger, state space search tool, and model checker "for free". The semantics is shown capable of automatically finding program errors, both statically and at runtime. It is also used to enumerate nondeterministic behavior.

  • Matching Logic: An Alternative to Hoare/Floyd Logic
    Grigore Roșu, Chucky Ellison, and Wolfram Schulte
    Proceedings of the 13th International Conference on Algebraic Methodology And Software Technology (AMAST'10), Lecture Notes in Computer Science 6486, pp. 142–162. Springer, 2010.
    DOI PDF Project Slides

    This paper introduces matching logic, a novel framework for defining axiomatic semantics for programming languages, inspired from operational semantics. Matching logic specifications are particular first-order formulae with constrained algebraic structure, called patterns. Program configurations satisfy patterns iff they match their algebraic structure and satisfy their constraints. Using a simple imperative language (IMP), it is shown that a restricted use of the matching logic proof system is equivalent to IMP's Hoare logic proof system, in that any proof derived using either can be turned into a proof using the other. Extensions to IMP including a heap with dynamic memory allocation and pointer arithmetic are given, requiring no extension of the underlying first-order logic; moreover, heap patterns such as lists, trees, queues, graphs, etc., are given algebraically using fist-order constraints over patterns.

Workshop Papers

  • The K Primer (version 3.3)
    Traian Florin Șerbănuță, Andrei Arusoaie, David Lazar, Chucky Ellison, Dorel Lucanu, and Grigore Roșu
    Proceedings of the 2nd International Workshop on the K Framework and Its Applications (K'11), Electronic Notes in Theoretical Computer Science 304, pp. 57–80. 2014.
    DOI PDF Project

    This paper serves as a brief introduction to the K tool, a system for formally defining programming languages. It is shown how sequential or concurrent languages can be defined in K simply and modularly. These formal definitions automatically yield an interpreter for the language, as well as program analysis tools such as a state-space explorer.

  • Making Maude Definitions More Interactive
    Andrei Arusoaie, Traian Florin Șerbănuță, Chucky Ellison, and Grigore Roșu
    Rewriting Logic and Its Applications (WRLA'12), Lecture Notes in Computer Science 7571, pp. 83–98. Springer, 2012.
    DOI PDF Slides

    This paper presents an interface for achieving interactive executions of Maude terms by allowing console and file input/output (I/O) operations. This interface consists of a Maude API for I/O operations, a Java-based server offering I/O capabilities, and a communication protocol between the two implemented using the external objects concept and Maude's TCP sockets. This interface was evaluated as part of the K framework, providing interactive interpreter capabilities for executing and testing programs for multiple language definitions.

  • A Rewriting Logic Approach to Type Inference
    Chucky Ellison, Traian Florin Șerbănuță, and Grigore Roșu
    Recent Trends in Algebraic Development Techniques (WADT'08), Lecture Notes in Computer Science 5486, pp. 135–151. Springer, 2009.
    DOI PDF Slides

    Meseguer and Roșu proposed rewriting logic semantics (RLS) as a programing language definitional framework that unifies operational and algebraic denotational semantics. RLS has already been used to define a series of didactic and real languages, but its benefits in connection with defining and reasoning about type systems have not been fully investigated. This paper shows how the same RLS style employed for giving formal definitions of languages can be used to define type systems. The same term-rewriting mechanism used to execute RLS language definitions can now be used to execute type systems, giving type checkers or type inferencers. The proposed approach is exemplified by defining the Hindley-Milner polymorphic type inferencer \(\mathcal{W}\) as a rewrite logic theory and using this definition to obtain a type inferencer by executing it in a rewriting logic engine. The inferencer obtained this way compares favorably with other definitions or implementations of \(\mathcal{W}\). The performance of the executable definition is within an order of magnitude of that of highly optimized implementations of type inferencers, such as that of OCaml.

  • On RDBMS-Integrated Disk-Based Architecture for Managing Massive Dormant Data in a Compressed Format
    Miroslav Dzakovic and Chucky Ellison
    Proceedings of the 4th International Workshop on Storage Network Architecture and Parallel I/Os (SNAPI'07), pp. 55–59. IEEE, 2007.
    DOI PDF Slides

    Managing dormant data in very large databases (VLDB) is about striking a balance between cost and performance. While magnetic tapes have long been a standard answer, the downsides limit the applicability for all VLDB. This paper describes a class of engineering and scientific data management applications that need a cost-effective solution with better support for updating and lower granularity of data access. To meet the needs of such applications, it then introduces an architectural extension to RDBMS. In the architecture, data is compressed using custom compression techniques, stored outside an RDBMS, and made available through views using SQL Table functions. This architecture is applicable to many industrial RDBMS, and it offers both linear speed-up and scale-out. The paper then discusses preliminary findings from a prototype implementation on Oracle RDBMS.

Technical Reports

  • On Compiling Rewriting Logic Language Definitions into Competitive Interpreters
    Michael Ilseman, Chucky Ellison, and Grigore Roșu
    University of Illinois, December 2010.
    URI PDF

    This paper describes a completely automated method for generating efficient and competitive interpreters from formal semantics expressed in Rewriting Logic. The semantics are compiled into OCaml code, which then acts as the interpreter for the language being defined. This automatic translation is tested on the semantics of an imperative as well as a functional language, and these generated interpreters are then benchmarked across a number of programs. In all cases the compiled interpreter is faster than directly executing the definition in a Rewriting system with improvements of several orders of magnitude.

  • Defining the Undefinedness of C
    Chucky Ellison and Grigore Roșu
    University of Illinois, April 2012.
    URI PDF

    This paper investigates undefined behavior in C and offers a few simple techniques for operationally specifying such behavior formally. A semantics-based undefinedness checker for C is developed using these techniques, as well as a test suite of undefined programs. The tool is evaluated against other popular analysis tools, using the new test suite in addition to a third-party test suite. The semantics-based tool performs at least as well or better than the other tools tested.

Talks

  • An Executable Formal Semantics of C with Applications
    39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'12). Philadelphia, PA. January 27, 2012.
    Slides Site Video (with ACM access)

  • An Executable Formal Semantics of C with Applications
    3rd Midwest Verification Day (MVD'11). Minneapolis, MN. September 30, 2011.
    Slides Site

  • Compiling K Definitions into Competitive Interpreters
    2nd International Workshop on the K Framework and its Applications (K'11). Cheile Grădiștei, Romania. August 8, 2011.
    Slides Site

  • The K Framework and a Formal Semantics of C
    2nd Midwest Verification Day (MVD'10). Iowa City, IA. September 17, 2010.
    Slides Site

  • Formalizing Experimental Languages
    1st International Workshop on the K Framework and its Applications (K'10). Nags Head, NC. August 16, 2010.
    Site

  • Matching Logic
    Integrated Vehicle Health Management / Software Health Management Technical Interchange Meeting. Co-located with NASA Aviation Safety Program Technical Conference (AvSAFE'09). Tyson's Corner, VA. November 19, 2009.
    Slides

  • "What is the answer to this question?" or, Self-reference and inherent limitations of computers and the brain
    The National Youth Leadership Forum on Technology (NYLF/TECH). San Jose, CA. July 12, 2006.