We have just finished building a more sophisticated .NET-based SOQL parser for our S4S and G4S products.

The parser can take complex Salesforce SOQL queries and create a .NET tree structure that shows how the objects in the query relate to one another. The hierarchical tree structure enables developers to do things like:

  • Dynamically manipulate a pre-existing SOQL query
  • Validate a user-created SOQL query by parsing it and then repeating the interpreted query back to the user
  • Deconstruct SOQL queries so the attributes of the query can be extracted and used in .NET systems, for example to:
    • Verify the query will not cause the requested recordset to exceed API limits
    • Translate the SOQL query to an equivalent SQL query
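
To make these uses concrete, here is a minimal sketch of what working with such a tree might look like. It is written in Python for brevity and is not the product's .NET API: the `Query` class, `to_soql` and `cap_limit` are hypothetical names invented for this illustration, and the WHERE clause is kept as raw text rather than parsed.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Query:
    """One SELECT statement; relationship subqueries nest inside `select`."""
    select: List[Union[str, "Query"]]
    source: str                      # object or relationship name after FROM
    where: Optional[str] = None      # kept as raw text in this sketch
    limit: Optional[int] = None


def to_soql(q: Query) -> str:
    """Render the tree back to SOQL text, e.g. to echo it to the user."""
    items = [f"({to_soql(i)})" if isinstance(i, Query) else i for i in q.select]
    text = f"SELECT {', '.join(items)} FROM {q.source}"
    if q.where:
        text += f" WHERE {q.where}"
    if q.limit is not None:
        text += f" LIMIT {q.limit}"
    return text


def cap_limit(q: Query, max_rows: int) -> None:
    """Clamp the outer LIMIT so the query never asks for more than max_rows."""
    if q.limit is None or q.limit > max_rows:
        q.limit = max_rows


# A hand-built tree standing in for the output of a parser.
tree = Query(
    select=["Id", "Name", Query(select=["LastName"], source="Contacts")],
    source="Account",
    where="Industry = 'Media'",
)
cap_limit(tree, 200)      # dynamic manipulation of an existing query
result = to_soql(tree)    # interpreted query, ready to show to the user
print(result)
```

Because the subquery is just another `Query` node inside the parent's select list, the same two functions work at any nesting depth. The ceiling of 200 rows is an arbitrary value chosen for the example, not a Salesforce limit.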

The parser is a predictive LL(1) recursive descent parser, meaning it is a top-down parser for a subset of the "LL" grammars. It parses the input from Left to right, constructing a Leftmost derivation of the sentence, and requires one token of lookahead to make each parsing decision.
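
The idea of deciding on a single token of lookahead can be sketched on a deliberately tiny grammar. The example below is an illustration in Python, not the S4S/G4S implementation: it assumes a made-up SOQL subset (SELECT list, FROM, optional LIMIT, nested subqueries, no WHERE), and the names `tokenize`, `Parser` and `parse` are hypothetical. Each grammar rule becomes one method, and every choice between alternatives is made by inspecting only the next token.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_][A-Za-z0-9_.]*)|([(),]))")
KEYWORDS = {"SELECT", "FROM", "LIMIT"}


def tokenize(text: str) -> List[Token]:
    """Split the input into (kind, text) pairs, ending with an EOF marker."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, word, punct = m.groups()
        if number:
            tokens.append(("NUMBER", number))
        elif word:
            kind = word.upper() if word.upper() in KEYWORDS else "NAME"
            tokens.append((kind, word))
        else:
            tokens.append((punct, punct))
        pos = m.end()
    tokens.append(("EOF", ""))
    return tokens


class Parser:
    def __init__(self, text: str) -> None:
        self.tokens = tokenize(text)
        self.pos = 0

    @property
    def lookahead(self) -> str:
        """Kind of the single next token; the only thing decisions depend on."""
        return self.tokens[self.pos][0]

    def expect(self, kind: str) -> str:
        """Consume the next token if it has the given kind, else fail."""
        found, text = self.tokens[self.pos]
        if found != kind:
            raise SyntaxError(f"expected {kind}, found {found}")
        self.pos += 1
        return text

    # query := SELECT item {"," item} FROM NAME [LIMIT NUMBER]
    def query(self) -> dict:
        self.expect("SELECT")
        items = [self.item()]
        while self.lookahead == ",":
            self.expect(",")
            items.append(self.item())
        self.expect("FROM")
        node = {"select": items, "from": self.expect("NAME"), "limit": None}
        if self.lookahead == "LIMIT":
            self.expect("LIMIT")
            node["limit"] = int(self.expect("NUMBER"))
        return node

    # item := NAME | "(" query ")"
    def item(self):
        if self.lookahead == "(":   # one token is enough to pick the branch
            self.expect("(")
            sub = self.query()
            self.expect(")")
            return sub
        return self.expect("NAME")


def parse(text: str) -> dict:
    parser = Parser(text)
    tree = parser.query()
    parser.expect("EOF")
    return tree


tree = parse("SELECT Id, (SELECT LastName FROM Contacts LIMIT 5) FROM Account LIMIT 10")
print(tree)
```

The parser never backtracks: in `item`, seeing `(` commits it to a subquery, and anything else must be a field name. Nested dictionaries are used for the tree only to keep the sketch short; a real implementation would more likely use typed node classes.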