# Data Sources

A Datasource is an entity within the Virtual Data Assistant (VDA) that serves as a container for a collection of metadata. Metadata is information about datasets, such as their location, schema, data types, and other relevant properties. A Datasource works like a virtual folder that groups related datasets together, making it easier for users to manage and access data efficiently.
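To make the container idea concrete, here is a minimal Python sketch of a Datasource holding dataset metadata. The class names, fields, and sample values are illustrative assumptions, not VDA's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Metadata describing one dataset (hypothetical shape, not VDA's schema)."""
    name: str
    location: str  # where the data lives, e.g. "public.orders"
    schema: dict[str, str] = field(default_factory=dict)  # column name -> data type


@dataclass
class Datasource:
    """A named container that groups related dataset metadata."""
    name: str
    datasets: list[Dataset] = field(default_factory=list)

    def add(self, dataset: Dataset) -> None:
        # Register a dataset's metadata under this Datasource.
        self.datasets.append(dataset)


# Example: one Datasource grouping two related datasets.
sales = Datasource(name="sales_db")
sales.add(Dataset("orders", "public.orders", {"id": "integer", "total": "numeric"}))
sales.add(Dataset("customers", "public.customers", {"id": "integer", "email": "text"}))
```

The Datasource stores only descriptions of the data (names, locations, column types), not the rows themselves, which is what lets it behave like a lightweight virtual folder.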

**Datasource creation using Connectors**

<figure><img src="/files/cOD4wSqXrxRmSthzjA6q" alt=""><figcaption><p>Create a datasource</p></figcaption></figure>

Connectors are modules or plugins that establish connections to specific data sources or databases. VDA offers connectors to popular data sources such as PostgreSQL and MySQL. When a user selects a connector and provides the necessary connection details, the connector establishes a link to the data source, allowing access to the data within it.
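As a rough illustration of what "connection details" typically contain, the sketch below bundles them into a small Python class and renders a scheme-style connection URL of the kind many database drivers accept. The `ConnectionDetails` class, its field names, and the sample host and credentials are assumptions made for this example; the fields VDA actually asks for may differ by connector.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class ConnectionDetails:
    """Hypothetical bundle of the values a user supplies for a connector."""
    connector: str  # illustrative connector key, e.g. "postgresql" or "mysql"
    host: str
    port: int
    database: str
    username: str
    password: str

    def to_url(self) -> str:
        # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
        user = quote(self.username, safe="")
        secret = quote(self.password, safe="")
        return f"{self.connector}://{user}:{secret}@{self.host}:{self.port}/{self.database}"


details = ConnectionDetails(
    connector="postgresql",
    host="db.example.com",
    port=5432,
    database="sales",
    username="analyst",
    password="p@ss/word",
)
url = details.to_url()
# url == "postgresql://analyst:p%40ss%2Fword@db.example.com:5432/sales"
```

Encoding the credentials matters because a raw `@` or `/` in a password would otherwise be read as part of the host or path.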

**Searching Datasets within a Datasource**

<figure><img src="/files/nkMtMTAxBC2r3WSces23" alt=""><figcaption></figcaption></figure>

Once datasets are created and organized within a Datasource, users can search for specific entities or datasets within that Datasource. Search simplifies data discovery, especially when many datasets are stored in the same Datasource. Users can apply filters to narrow the results to the appropriate datasets.
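Conceptually, filtered search means keeping only the datasets that satisfy every filter the user supplies. The Python sketch below shows that idea over an in-memory list. The `search` function, the filter names (`name_contains`, `tag`, `has_column`), and the sample catalog are hypothetical and do not reflect VDA's actual search API or the filters it exposes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dataset:
    """Minimal dataset metadata used for this example only."""
    name: str
    tags: set[str] = field(default_factory=set)
    columns: dict[str, str] = field(default_factory=dict)  # column name -> data type


def search(
    datasets: list[Dataset],
    name_contains: Optional[str] = None,
    tag: Optional[str] = None,
    has_column: Optional[str] = None,
) -> list[Dataset]:
    """Return datasets matching every filter that was provided."""
    results = []
    for ds in datasets:
        # Case-insensitive substring match on the dataset name.
        if name_contains and name_contains.lower() not in ds.name.lower():
            continue
        if tag and tag not in ds.tags:
            continue
        if has_column and has_column not in ds.columns:
            continue
        results.append(ds)
    return results


catalog = [
    Dataset("orders", {"finance"}, {"id": "integer", "total": "numeric"}),
    Dataset("order_items", {"finance"}, {"order_id": "integer", "sku": "text"}),
    Dataset("customers", {"crm"}, {"id": "integer", "email": "text"}),
]

matches = search(catalog, name_contains="order", has_column="sku")
# matches contains only the "order_items" dataset
```

Filters combine with AND semantics here, so each additional filter can only narrow the result set; omitting all filters returns every dataset.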

**List of Available Connectors**

* [Amazon Athena](https://aws.amazon.com/athena/)
* [Amazon EventBridge](https://aws.amazon.com/eventbridge/)
* [AWS Glue](https://aws.amazon.com/glue/) and anything built on it
* [Amazon Redshift](https://aws.amazon.com/redshift/)
* [Apache Cassandra](https://cassandra.apache.org/)
* [Apache Druid](https://druid.apache.org/)
* [Apache Hive](https://hive.apache.org/)
* CSV
* [dbt](https://www.getdbt.com/)
* [Delta Lake](https://delta.io/)
* [Elasticsearch](https://www.elastic.co/)
* [Google BigQuery](https://cloud.google.com/bigquery)
* [IBM DB2](https://www.ibm.com/analytics/db2)
* [Kafka Schema Registry](https://docs.confluent.io/platform/current/schema-registry/index.html)
* [Microsoft SQL Server](https://www.microsoft.com/en-us/sql-server/default.aspx)
* [MySQL](https://www.mysql.com/)
* [Oracle](https://www.oracle.com/index.html) (through dbapi or sql\_alchemy)
* [PostgreSQL](https://www.postgresql.org/)
* [PrestoDB](http://prestodb.io/)
* [Trino (formerly Presto SQL)](https://trino.io/)
* [Vertica](https://www.vertica.com/)
* [Snowflake](https://www.snowflake.com/)