# Data Sources

A Datasource is an entity within the Virtual Data Assistant (VDA) that serves as a container for a collection of metadata. Metadata is information about datasets, such as their location, schema, data types, and other relevant properties. In effect, a Datasource is a virtual folder that groups related datasets so that users can manage and access data efficiently.

**Datasource creation using Connectors**

<figure><img src="/files/cOD4wSqXrxRmSthzjA6q" alt=""><figcaption><p>Create a datasource</p></figcaption></figure>

Connectors are modules or plugins that establish connections to specific data sources or databases. VDA offers connectors for many popular data sources, such as PostgreSQL and MySQL. When a user selects a connector and provides the necessary connection details, the connector establishes a link to the data source and gives access to the data within it.
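To make "connection details" concrete, here is a minimal Python sketch of the kind of information a database connector typically needs. The `ConnectionDetails` class, its field names, and the host and credentials are all hypothetical; they are not part of VDA's API, and the connector form in VDA may ask for different fields.

```python
from dataclasses import dataclass
from urllib.parse import quote_plus


@dataclass
class ConnectionDetails:
    """Hypothetical connection details a user might supply for a connector."""
    driver: str      # e.g. "postgresql" or "mysql"
    host: str
    port: int
    database: str
    username: str
    password: str

    def to_url(self) -> str:
        # Percent-encode credentials so characters like "@" or "/" do not break the URL.
        user = quote_plus(self.username)
        pwd = quote_plus(self.password)
        return f"{self.driver}://{user}:{pwd}@{self.host}:{self.port}/{self.database}"


# Illustrative values only; none of these come from the VDA documentation.
details = ConnectionDetails(
    driver="postgresql",
    host="db.example.com",
    port=5432,
    database="sales",
    username="vda_reader",
    password="p@ss/word",
)
print(details.to_url())
```

The URL form shown here follows the common `driver://user:password@host:port/database` convention used by many database clients; it is an illustration of the inputs involved, not a statement of how VDA stores or transmits them.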

**Searching Datasets within a Datasource**

<figure><img src="/files/nkMtMTAxBC2r3WSces23" alt=""><figcaption></figcaption></figure>

Once datasets are created and organized within a Datasource, users can search for specific entities or datasets within that Datasource. Search simplifies data discovery, especially when many datasets are stored in the same Datasource. A user can apply filters to narrow the results to the appropriate datasets.
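The idea of filtered search over dataset metadata can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dataset` class, the `search_datasets` function, and the three filters (`name_contains`, `schema`, `has_column`) are hypothetical and do not describe VDA's actual search implementation or the filters it exposes.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical metadata record for one dataset in a Datasource."""
    name: str
    schema: str
    columns: dict[str, str] = field(default_factory=dict)  # column name -> data type


def search_datasets(datasets, name_contains=None, schema=None, has_column=None):
    """Return the datasets that match every filter that was provided."""
    results = []
    for ds in datasets:
        if name_contains and name_contains.lower() not in ds.name.lower():
            continue
        if schema and ds.schema != schema:
            continue
        if has_column and has_column not in ds.columns:
            continue
        results.append(ds)
    return results


# Illustrative catalog; the dataset names and columns are made up.
catalog = [
    Dataset("orders", "public", {"order_id": "integer", "customer_id": "integer"}),
    Dataset("customers", "public", {"customer_id": "integer", "email": "text"}),
    Dataset("order_events", "staging", {"event_id": "integer", "order_id": "integer"}),
]

matches = search_datasets(catalog, name_contains="order", schema="public")
print([ds.name for ds in matches])  # prints ['orders']
```

Filters combine with AND semantics in this sketch, so each additional filter narrows the result set; omitting all filters returns every dataset.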

**List of Available Connectors**

* [Amazon Athena](https://aws.amazon.com/athena/)
* [Amazon EventBridge](https://aws.amazon.com/eventbridge/)
* [Amazon Glue](https://aws.amazon.com/glue/) and anything built over it
* [Amazon Redshift](https://aws.amazon.com/redshift/)
* [Apache Cassandra](https://cassandra.apache.org/)
* [Apache Druid](https://druid.apache.org/)
* [Apache Hive](https://hive.apache.org/)
* CSV
* [dbt](https://www.getdbt.com/)
* [Delta Lake](https://delta.io/)
* [Elasticsearch](https://www.elastic.co/)
* [Google BigQuery](https://cloud.google.com/bigquery)
* [IBM DB2](https://www.ibm.com/analytics/db2)
* [Kafka Schema Registry](https://docs.confluent.io/platform/current/schema-registry/index.html)
* [Microsoft SQL Server](https://www.microsoft.com/en-us/sql-server/default.aspx)
* [MySQL](https://www.mysql.com/)
* [Oracle](https://www.oracle.com/index.html) (through DB-API or SQLAlchemy)
* [PostgreSQL](https://www.postgresql.org/)
* [PrestoDB](http://prestodb.io/)
* [Trino (formerly Presto SQL)](https://trino.io/)
* [Vertica](https://www.vertica.com/)
* [Snowflake](https://www.snowflake.com/)

