llvm/clang-tools-extra/pseudo/include/clang-pseudo/Disambiguate.h

//===--- Disambiguate.h - Find the best tree in the forest -------*- C++-*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// A GLR parse forest represents every possible parse tree for the source code.
//
// Before we can do useful analysis/editing of the code, we need to pick a
// single tree which we think is accurate. We use three main types of clues:
//
// A) Semantic language rules may restrict which parses are allowed.
//    For example, `string string string X` is *grammatical* C++, but only a
//    single type-name is allowed in a decl-specifier-sequence.
//    Where possible, these interpretations are forbidden by guards.
//    Sometimes this isn't possible, or we want our parser to be lenient.
//
// B) Some constructs are rarer, while others are common.
//    For example `a<b>::c` is often a template specialization, and rarely a
//    double comparison between a, b, and c.
//
// C) Identifier text hints at whether it names a type, value, template etc.
//    "std" is usually a namespace; a project index may also guide us.
//    Hints may be within the document: if one occurrence of 'foo' is a variable
//    then the others probably are too.
//    (Text need not match: similar CaseStyle can be a weak hint, too).
//
//----------------------------------------------------------------------------//
//
// Mechanically, we replace each ambiguous node with its best alternative.
//
// "Best" is determined by assigning bonuses/penalties to nodes, to express
// the clues of type A and B above. A forest node representing an unlikely
// parse would apply a penalty to every subtree it is present in.
// Disambiguation proceeds bottom-up, so that the score of each alternative
// is known when a decision is made.
//
// Identifier-based hints within the document mean some nodes should be
// *correlated*. Rather than resolve these simultaneously, we make the most
// certain decisions first and use these results to update bonuses elsewhere.
//
//===----------------------------------------------------------------------===//

#include "clang-pseudo/Forest.h"
#include "llvm/ADT/DenseMap.h"

namespace clang::pseudo {

struct DisambiguateParams {};

// Maps ambiguous nodes onto the index of their preferred alternative.
using Disambiguation = llvm::DenseMap<const ForestNode *, unsigned>;

// Resolve each ambiguous node in the forest.
// Returns the index of the chosen alternative for each ambiguous node.
// FIXME: current implementation is a placeholder and chooses arbitrarily.
Disambiguation disambiguate(const ForestNode *Root,
                            const DisambiguateParams &Params);

// Remove all ambiguities from the forest, resolving them according to Disambig.
void removeAmbiguities(ForestNode *&Root, const Disambiguation &Disambig);

} // namespace clang::pseudo