A parser for C# using flex/bison

James Power

Department of Computer Science, National University of Ireland, Maynooth.


As part of a separate project I knocked together a flex/bison scanner and parser for C#. I really just wanted a clean C# grammar, but the one in the spec. had a lot of duplication.

I worked off version 0.28 of the C# Language Specification (of 5/7/2001). I've tested the scanner/parser on about 3,000 C# programs, and it seems to be working OK. Whenever I had to resolve a conflict in the grammar I went for the "most general" approach; thus the parser should accept all valid programs, but it will also accept some invalid ones.

There are a few points you should note:

C# has what seem to be "context sensitive keywords" - that is, words which can be either keywords or identifiers depending on where they occur in the program. It seems a particularly silly thing to design into a language. Anyway, I've put in some flex states to deal with this.

You might also like to check out the mono C# compiler. It includes a grammar written for a bison-like tool (one that generates C#), along with lots of C# code. (I'm not connected with the mono project in any way.)

I presented a paper about the parser's design, "Applying Software Engineering Techniques to Parser Design", at the Conference of the South African Institute of Computer Scientists and Information Technologists, in Port Elizabeth, South Africa, September 16-18 2002.

If you find any bugs in this, or think there's something else I should mention here, let me know.

Download

The scanner is in csharp-lex.l, and the parser is in csharp.y. There's just one (non-generated) header file, lex.yy.h, needed to tie these together.

To make an application out of this, here's a minimal program main.c, and a Makefile.

Or you can get all of these files as a single .zip file: csharp.zip.

I made these using flex version 2.5.4 and bison version 1.28, but I don't think there's anything too special in there. I do use exclusive start conditions as well as start-condition stacks in the lexer - these may not be available in all flex/lex variants.

 


Last Modified: 19 Dec 2004