QueryBridge provides over one million annotated natural language questions aligned with executable SPARQL queries, enabling robust training and evaluation of KGQA systems.
QueryBridge is a large-scale dataset designed to address the long-standing data scarcity problem in Question Answering over Knowledge Graphs (KGQA). Unlike prior benchmarks that contain only thousands of questions, QueryBridge provides 1,004,534 natural language questions paired with executable SPARQL queries over DBpedia.
- Enables the development of data-intensive neural semantic parsers and large language models (LLMs) by surmounting the 30k-example "data wall" of legacy benchmarks.
- Enriched with structural metadata and XML-style tags (`<qt>`, `<p>`, `<o>`, `<cc>`) to support both end-to-end and component-level evaluation.
- Features comprehensive coverage of multi-hop query shapes, including Chain (6.77%), Star (56.63%), Tree, Cycle, Flower, and Set-Shape (13.65%) queries.
QueryBridge is systematically generated via Maestro, the first framework to automatically produce comprehensive, utterance-aware benchmarks for any targeted Knowledge Graph.
Figure 1: Maestro architecture for automated benchmark generation. The system (1) selects representative seed entities, (2) instantiates diverse query shapes over the KG, and (3) lexicalizes graph patterns into natural language.
Static benchmarks become stale as KG ontologies evolve. Maestro addresses this by traversing the graph starting from selected seeds to find all valid subgraph shapes, ensuring benchmarks remain accurate.
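To make the traversal idea concrete, here is a minimal, self-contained sketch that enumerates Chain- and Star-shaped patterns around a seed entity in a toy in-memory graph. The triples, entity names, and helper functions are illustrative assumptions, not Maestro's actual implementation.

```python
from collections import defaultdict
from itertools import combinations

# Toy knowledge graph as (subject, predicate, object) triples.
# Entities and predicates are illustrative placeholders, not Maestro output.
triples = [
    ("dbr:Berlin", "dbo:country", "dbr:Germany"),
    ("dbr:Berlin", "dbo:leader", "dbr:Kai_Wegner"),
    ("dbr:Germany", "dbo:capital", "dbr:Berlin"),
    ("dbr:Germany", "dbo:currency", "dbr:Euro"),
]

# Index outgoing edges per subject for traversal from a seed.
out_edges = defaultdict(list)
for s, p, o in triples:
    out_edges[s].append((p, o))

def star_patterns(seed, size=2):
    """Combinations of `size` distinct outgoing edges around the seed (Star shape)."""
    return list(combinations(out_edges[seed], size))

def chain_patterns(seed, length=2):
    """Paths of `length` hops starting at the seed (Chain shape)."""
    chains = []

    def walk(node, path):
        if len(path) == length:
            chains.append(tuple(path))
            return
        for p, o in out_edges[node]:
            walk(o, path + [(node, p, o)])

    walk(seed, [])
    return chains

print(star_patterns("dbr:Berlin"))
print(chain_patterns("dbr:Berlin"))
```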
By mapping external text corpora to KG predicates, Maestro captures semantically equivalent utterances. This results in high-quality natural language questions that are on par with manually generated ones.
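The following sketch conveys the underlying idea of predicate lexicalization with a hand-written template dictionary. The predicates and paraphrase templates are invented for illustration; Maestro derives such paraphrases from external corpora rather than from a hard-coded table.

```python
# Hypothetical predicate-to-utterance templates; a real pipeline would mine
# these paraphrases from text corpora rather than hard-code them.
LEXICALIZATIONS = {
    "dbo:spouse":     ["Who is the spouse of {e}?", "Whom is {e} married to?"],
    "dbo:birthPlace": ["Where was {e} born?", "What is the birthplace of {e}?"],
}

def verbalize(predicate, entity_label):
    """Produce every known natural-language variant for a single-triple pattern."""
    return [t.format(e=entity_label) for t in LEXICALIZATIONS.get(predicate, [])]

print(verbalize("dbo:spouse", "Barack Obama"))
# ['Who is the spouse of Barack Obama?', 'Whom is Barack Obama married to?']
```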
Instead of random selection, Maestro uses Class Importance ($I_c$) and Entity Popularity ($P_e$) heuristics to ensure representative sampling of both common and tail entities.
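The exact definitions of $I_c$ and $P_e$ are given in the Maestro paper; the sketch below only conveys how the two signals could be combined into a single seed score, using class size and node degree as stand-in proxies. All entities, classes, and counts are made-up placeholders.

```python
from collections import Counter

# Toy entity metadata; classes and degree counts are placeholder values.
entities = {
    "dbr:Berlin":       {"class": "dbo:City",    "degree": 1200},
    "dbr:Germany":      {"class": "dbo:Country", "degree": 5400},
    "dbr:Kleinmachnow": {"class": "dbo:City",    "degree": 35},
}

class_sizes = Counter(meta["class"] for meta in entities.values())
max_degree = max(meta["degree"] for meta in entities.values())

def seed_score(name, alpha=0.5):
    """Weighted mix of class importance (proxied here by class size) and
    entity popularity (proxied here by normalized degree)."""
    meta = entities[name]
    importance = class_sizes[meta["class"]] / len(entities)
    popularity = meta["degree"] / max_degree
    return alpha * importance + (1 - alpha) * popularity

# Rank candidates so seeds can be drawn from both the head and the tail of the list.
print(sorted(entities, key=seed_score, reverse=True))
```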
QueryBridge is distributed through the Hugging Face datasets library, allowing researchers to seamlessly load, filter, and process the dataset without manual downloads.
```python
from datasets import load_dataset

# Load QueryBridge (≈1.0M question–SPARQL pairs)
# Cached and streamed automatically by Hugging Face
dataset = load_dataset("aorogat/QueryBridge")

# Inspect dataset structure
print(dataset)
print(dataset["train"].column_names)

# Example 1: Filter by query structure (STAR-shaped queries)
star_queries = dataset["train"].filter(
    lambda ex: ex["shapeType"] == "STAR"
)
print(f"Number of STAR queries: {len(star_queries)}")

# Example 2: Filter by reasoning complexity
complex_queries = dataset["train"].filter(
    lambda ex: ex["questionComplexity"] >= 0.7
)

# Example 3: Access token-level semantic supervision
sample = dataset["train"][0]
print("Raw question:")
print(sample["questionString"])
print("\nTagged question (NL ↔ SPARQL alignment):")
print(sample["questionStringTagged"])

# Example 4: Retrieve the executable SPARQL query
print("\nCorresponding SPARQL query:")
print(sample["query"])
```
| Field | Description |
|---|---|
| `questionString` | The raw natural language question. |
| `questionStringTagged` | XML-annotated version linking to SPARQL components. |
| `query` | The executable SPARQL query for DBpedia. |
| `shapeType` | The structural query pattern (Chain, Star, Tree, etc.). |
| `questionComplexity` | Normalized score based on tokens, triples, and keywords. |
| `answerCardinality` | Number of gold standard answers returned. |
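As a quick sanity check, the shape distribution quoted earlier can be recomputed directly from the `shapeType` column (assuming the dataset was loaded as in the snippet above):

```python
from collections import Counter

# Tally query shapes across the training split (continues the loading snippet above).
shape_counts = Counter(dataset["train"]["shapeType"])
total = sum(shape_counts.values())
for shape, count in shape_counts.most_common():
    print(f"{shape}: {count} ({100 * count / total:.2f}%)")
```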
If you use QueryBridge, Maestro, or any of their derived resources (e.g., subsets, extensions, benchmarks, or baselines) in your research, please cite the corresponding publications below. Proper citation supports reproducibility, transparency, and sustained maintenance of large-scale benchmarks.