# DataSource

## What is a Data Source

A data source is an entity within the Virtual Data Assistant (VDA) that serves as a container for a collection of metadata. Metadata is information about datasets, such as the data source location, schema, data types, and other relevant properties. In effect, a data source is a virtual folder that groups related datasets together, making it easier for users to manage and access data.

## How to Create a new Data Source

Navigate to Datasources and click Create to create a new data source.

<figure><img src="/files/wDvevcuf6tvXtX4XVWqX" alt=""><figcaption><p>create new data source</p></figcaption></figure>

Select the radio button for the connector (source type) that matches the required data source.

<figure><img src="/files/cOD4wSqXrxRmSthzjA6q" alt=""><figcaption><p>Create a datasource</p></figcaption></figure>

#### Advance Properties

If you do not want all tables, you can add table names in the Include or Exclude boxes:

* **Include**: Fetch only the listed table(s).
* **Exclude**: Fetch all tables except the listed table(s).
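
The Include and Exclude behaviour can be sketched as a simple filter. This is an illustration only, not VDA's implementation: the helper name `select_tables` is hypothetical, and the sketch assumes exact, case-sensitive name matching, with Include applied before Exclude when both are supplied.

```python
def select_tables(all_tables, include=None, exclude=None):
    """Return the tables that would be fetched (hypothetical helper).

    Assumptions, not confirmed by VDA: names match exactly and
    case-sensitively; an empty Include list means "all tables".
    """
    include = set(include or [])
    exclude = set(exclude or [])
    # Include: keep only the listed tables (or everything if none are listed).
    selected = [t for t in all_tables if not include or t in include]
    # Exclude: drop the listed tables from whatever remains.
    return [t for t in selected if t not in exclude]


tables = ["orders", "customers", "audit_log"]
print(select_tables(tables, include=["orders", "customers"]))  # ['orders', 'customers']
print(select_tables(tables, exclude=["audit_log"]))            # ['orders', 'customers']
```

Both calls select the same two tables: Include names what to keep, while Exclude names what to drop.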

Connectors are modules or plugins that establish connections to specific data sources or databases. VDA offers connectors to popular data sources such as PostgreSQL and MySQL. When a user selects a connector and provides the necessary connection details, the connector establishes a link to the data source, allowing access to the data within it.

### Connector Parameters

#### Postgres

<figure><img src="/files/Wl3qhN4ptyOpxd2KdqIM" alt=""><figcaption><p>Postgres Connection</p></figcaption></figure>

The Postgres data source type supports two connection methods:

* **Detail**:

  Select this method when you are connecting with a standard username, password, and port.

* **URL**:

  Select this method when you have a custom URL connection string. Below is an example of a connection string:

  `postgresql://user:password@hostname:port/database`
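
Passwords that contain characters such as `@` or `/` need to be percent-encoded before they are placed in a URL of this form. The sketch below shows one way to assemble and re-read such a string with Python's standard library. The helper name `build_postgres_url` and the sample values are hypothetical, and whether VDA decodes percent-encoded credentials is an assumption you should verify against your deployment.

```python
from urllib.parse import quote, unquote, urlsplit


def build_postgres_url(user, password, host, port, database):
    """Assemble a postgresql:// URL (hypothetical helper)."""
    # Percent-encode credentials so reserved characters do not break the URL.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql://{safe_user}:{safe_password}@{host}:{port}/{database}"


url = build_postgres_url("db_user", "p@ss/word", "pg.example.com", 5432, "sales")
print(url)  # postgresql://db_user:p%40ss%2Fword@pg.example.com:5432/sales

# Splitting the URL again recovers the individual connection details.
parts = urlsplit(url)
print(parts.hostname, parts.port, parts.path.lstrip("/"))
print(unquote(parts.password))  # p@ss/word
```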

#### Oracle

<figure><img src="/files/4XdMAimiAZchRpZrIOvF" alt=""><figcaption><p>Oracle Connection</p></figcaption></figure>

**Detail Method**:

* **Description**: Use this method when connecting to an Oracle database using standard credentials and connection parameters.
* **Parameters**:
  * **Role**: The role assigned to the user (e.g., DBA, Developer).
  * **Host**: The hostname or IP address of the Oracle server.
  * **User**: The username for the Oracle database.
  * **Password**: The password for the Oracle database.
  * **Schema**: The schema within the Oracle database to which the user has access.
  * **Service**: The Oracle service name (e.g., ORCL).

<figure><img src="/files/YAakXfupAq0dARZqfHTc" alt=""><figcaption><p>Oracle Connection - Advance</p></figcaption></figure>

**URL Method**:

* **Description**: Use this method when you have a custom URL connection string for the Oracle database.

`jdbc:oracle:thin:@//oracle.example.com:1521/ORCL`
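
To check how a URL of this shape maps onto the Host and Service fields, it can be split with a regular expression. This is a minimal sketch that handles only the service-name form shown above. The helper name `parse_oracle_thin_url` is hypothetical, and falling back to port 1521 (the usual Oracle listener default) when no port is given is an assumption.

```python
import re

# Matches jdbc:oracle:thin:@//host[:port]/service
THIN_URL = re.compile(
    r"^jdbc:oracle:thin:@//(?P<host>[^:/]+)(?::(?P<port>\d+))?/(?P<service>.+)$"
)


def parse_oracle_thin_url(url):
    """Split a service-name thin URL into its parts (hypothetical helper)."""
    match = THIN_URL.match(url)
    if match is None:
        raise ValueError(f"not a service-name thin URL: {url!r}")
    return {
        "host": match["host"],
        # Assumption: use 1521 when the URL omits the port.
        "port": int(match["port"] or 1521),
        "service": match["service"],
    }


print(parse_oracle_thin_url("jdbc:oracle:thin:@//oracle.example.com:1521/ORCL"))
```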

#### Kafka

<figure><img src="/files/66kLrCJBsfVFFPKJhgl2" alt=""><figcaption><p>Kafka Connection - Advance</p></figcaption></figure>

**Name**:

* **Description**: The name of the Kafka connection.
* **Example**: `Kafka_Prod_Cluster`

**URL**:

* **Description**: The base URL for the Kafka service.
* **Importance**: Provides the endpoint for connecting to the Kafka service.
* **Example**: `kafka.example.com`

**Broker URL**:

* **Description**: A comma-separated list of host and port pairs that are the addresses of the Kafka brokers.
* **Importance**: Specifies the Kafka brokers to connect to.
* **Example**: `kafka1.example.com:9092,kafka2.example.com:9092`
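
A comma-separated broker list like the example above can be validated before it is entered. The sketch below splits it into host and port pairs. The helper name `parse_brokers` is hypothetical, and defaulting to port 9092 when an entry has no port is an assumption. IPv6 literals are not handled.

```python
def parse_brokers(broker_url, default_port=9092):
    """Split 'host:port,host:port' into (host, port) tuples (hypothetical helper)."""
    brokers = []
    for entry in broker_url.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and stray whitespace
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No ':' in the entry, so only a hostname was given.
            host, port = entry, default_port
        brokers.append((host, int(port)))
    return brokers


print(parse_brokers("kafka1.example.com:9092,kafka2.example.com:9092"))
# [('kafka1.example.com', 9092), ('kafka2.example.com', 9092)]
```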

**User**:

* **Description**: The username for the Kafka connection.
* **Importance**: Used for authenticating the user accessing the Kafka cluster.
* **Example**: `kafka_user`

**Password**:

* **Description**: The password for the Kafka connection.
* **Importance**: Secures the connection by authenticating the user.
* **Example**: `securepassword`

**Secure Connection**:

* **Description**: A checkbox option to enable a secure connection.
* **Importance**: Ensures that the data transmitted between the client and Kafka brokers is encrypted.
* **Example**: `Checked` or `Unchecked`

**Included Tables**:

* **Description**: A list of specific tables/topics to be included in the data ingestion process.
* **Importance**: Allows for targeted data ingestion, focusing on relevant tables/topics.
* **Example**: `topic1, topic2, topic3`

**Excluded Tables**:

* **Description**: A list of specific tables/topics to be excluded from the data ingestion process.
* **Importance**: Prevents unnecessary or irrelevant data from being ingested.
* **Example**: `topic4, topic5`

#### Hive

<figure><img src="/files/hdT1N94Y3odCjJHhkl8b" alt=""><figcaption><p>Hive Connection - Advance</p></figcaption></figure>

**Detail Method**

**Name**:

* **Description**: The name of the Hive connection.
* **Importance**: Identifies the specific Hive data source within the VDA.
* **Example**: `Hive_Prod_Cluster`

**Metastore URL**:

* **Description**: The URL of the Hive Metastore service.
* **Importance**: Provides the endpoint for connecting to the Hive Metastore, which manages metadata for Hive tables and databases.
* **Example**: `thrift://metastore.example.com:9083`

**Hive URL**:

* **Description**: The URL of the Hive server.
* **Importance**: Provides the endpoint for connecting to the Hive server for executing queries and accessing data.
* **Example**: `jdbc:hive2://hive.example.com:10000/default`
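
The Metastore URL and the Hive URL use different schemes (`thrift://` and `jdbc:hive2://`), so it is easy to paste one into the other's field. The sketch below splits either form into scheme, host, port, and database for a quick sanity check. The helper name `split_endpoint` is hypothetical, and any `;key=value` session parameters after the database name are left attached to the last element.

```python
from urllib.parse import urlsplit


def split_endpoint(url):
    """Return (scheme, host, port, database) for a URL (hypothetical helper)."""
    # urlsplit does not understand the "jdbc:" prefix, so strip it first.
    if url.startswith("jdbc:"):
        url = url[len("jdbc:"):]
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.port, parts.path.lstrip("/")


print(split_endpoint("thrift://metastore.example.com:9083"))
# ('thrift', 'metastore.example.com', 9083, '')
print(split_endpoint("jdbc:hive2://hive.example.com:10000/default"))
# ('hive2', 'hive.example.com', 10000, 'default')
```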

**Database**:

* **Description**: The name of the specific database within the Hive server to connect to.
* **Importance**: Specifies the target database for data operations.
* **Example**: `default`

**Included Tables**:

* **Description**: A list of specific tables to be included in the data ingestion process.
* **Importance**: Allows for targeted data ingestion, focusing on relevant tables.
* **Example**: `table1, table2, table3`

**Excluded Tables**:

* **Description**: A list of specific tables to be excluded from the data ingestion process.
* **Importance**: Prevents unnecessary or irrelevant data from being ingested.
* **Example**: `table4, table5`

#### SQL Server

<figure><img src="/files/58BdhxlXssfwf3cWjH83" alt=""><figcaption><p>SQL Server Connection</p></figcaption></figure>

**Detail Method**

**Name**:

* **Description**: The name of the SQL Server connection.
* **Importance**: Identifies the specific SQL Server data source within the VDA.
* **Example**: `SQLServer_Prod`

**Host**:

* **Description**: The hostname or IP address of the SQL Server.
* **Importance**: Specifies the server where the SQL Server is hosted.
* **Example**: `sqlserver.example.com`

**Port**:

* **Description**: The port number used to connect to the SQL Server.
* **Importance**: Specifies the network port for the SQL Server connection.
* **Example**: `1433`

**User**:

* **Description**: The username for the SQL Server database.
* **Importance**: Used for authenticating the user accessing the SQL Server.
* **Example**: `db_user`

**Password**:

* **Description**: The password for the SQL Server database.
* **Importance**: Secures the connection by authenticating the user.
* **Example**: `securepassword`

**Database**:

* **Description**: The name of the specific database within the SQL Server to connect to.
* **Importance**: Specifies the target database for data operations.
* **Example**: `mydatabase`

**Schema**:

* **Description**: The schema within the SQL Server database.
* **Importance**: Defines the organizational structure of tables within the database.
* **Example**: `dbo`

**Included Tables**:

* **Description**: A list of specific tables to be included in the data ingestion process.
* **Importance**: Allows for targeted data ingestion, focusing on relevant tables.
* **Example**: `table1, table2, table3`

**Excluded Tables**:

* **Description**: A list of specific tables to be excluded from the data ingestion process.
* **Importance**: Prevents unnecessary or irrelevant data from being ingested.
* **Example**: `table4, table5`

**URL Method**

**Description**: Use this method when you have a custom URL connection string for SQL Server, incorporating all necessary connection details.

`jdbc:sqlserver://sqlserver.example.com:1433;databaseName=mydatabase;user=db_user;password=securepassword;schema=dbo`
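
A SQL Server URL packs its settings into `;`-separated `key=value` properties after the server address. The sketch below breaks the example apart so each property can be compared with the Detail fields. The helper name `parse_sqlserver_url` is hypothetical. It assumes port 1433 when none is given, and it does not handle values escaped with braces.

```python
def parse_sqlserver_url(url):
    """Split a jdbc:sqlserver:// URL into host, port, and properties (hypothetical helper)."""
    prefix = "jdbc:sqlserver://"
    if not url.startswith(prefix):
        raise ValueError("expected a jdbc:sqlserver:// URL")
    # The first segment is the server address; the rest are key=value properties.
    server, *props = url[len(prefix):].split(";")
    host, _, port = server.partition(":")
    settings = dict(p.split("=", 1) for p in props if "=" in p)
    return host, int(port or 1433), settings


host, port, settings = parse_sqlserver_url(
    "jdbc:sqlserver://sqlserver.example.com:1433;"
    "databaseName=mydatabase;user=db_user;password=securepassword;schema=dbo"
)
settings["password"] = "***"  # avoid echoing the secret
print(host, port, settings)
```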

#### MySQL

<figure><img src="/files/JCp2Cq4AymWsOPRjNU9Z" alt=""><figcaption></figcaption></figure>

**Detail Method**

**Name**:

* **Description**: The name of the MySQL connection.
* **Importance**: Identifies the specific MySQL data source within the VDA.
* **Example**: `MySQL_Prod`

**Host**:

* **Description**: The hostname or IP address of the MySQL server.
* **Importance**: Specifies the server where the MySQL database is hosted.
* **Example**: `mysql.example.com`

**Port**:

* **Description**: The port number used to connect to the MySQL server.
* **Importance**: Specifies the network port for the MySQL connection.
* **Example**: `3306`

**User**:

* **Description**: The username for the MySQL database.
* **Importance**: Used for authenticating the user accessing the MySQL database.
* **Example**: `db_user`

**Password**:

* **Description**: The password for the MySQL database.
* **Importance**: Secures the connection by authenticating the user.
* **Example**: `securepassword`

**Database**:

* **Description**: The name of the specific database within the MySQL server to connect to.
* **Importance**: Specifies the target database for data operations.
* **Example**: `mydatabase`

**Schema**:

* **Description**: The schema within the MySQL database.
* **Importance**: Defines the organizational structure of tables within the database.
* **Example**: `public`

**Included Tables**:

* **Description**: A list of specific tables to be included in the data ingestion process.
* **Importance**: Allows for targeted data ingestion, focusing on relevant tables.
* **Example**: `table1, table2, table3`

**Excluded Tables**:

* **Description**: A list of specific tables to be excluded from the data ingestion process.
* **Importance**: Prevents unnecessary or irrelevant data from being ingested.
* **Example**: `table4, table5`

**URL Method**

**Description**: Use this method when you have a custom URL connection string for MySQL, incorporating all necessary connection details.

`jdbc:mysql://mysql.example.com:3306/mydatabase?user=db_user&password=securepassword`
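
In this form the credentials travel in the query string, so characters such as `&` or `=` in a password must be encoded or they will be read as separators. The sketch below builds the URL with the standard library's `urlencode`. The helper name `build_mysql_url` is hypothetical, and whether VDA decodes percent-encoded values in this field is an assumption to verify.

```python
from urllib.parse import urlencode


def build_mysql_url(host, port, database, user, password):
    """Assemble a jdbc:mysql:// URL with encoded credentials (hypothetical helper)."""
    # urlencode percent-encodes reserved characters in each value.
    query = urlencode({"user": user, "password": password})
    return f"jdbc:mysql://{host}:{port}/{database}?{query}"


print(build_mysql_url("mysql.example.com", 3306, "mydatabase", "db_user", "securepassword"))
# jdbc:mysql://mysql.example.com:3306/mydatabase?user=db_user&password=securepassword
print(build_mysql_url("mysql.example.com", 3306, "mydatabase", "db_user", "p&ss=1"))
# jdbc:mysql://mysql.example.com:3306/mydatabase?user=db_user&password=p%26ss%3D1
```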

**List of other connectors**

* [Amazon Athena](https://aws.amazon.com/athena/)
* [Amazon EventBridge](https://aws.amazon.com/eventbridge/)
* [Amazon Glue](https://aws.amazon.com/glue/) and anything built over it
* [Amazon Redshift](https://aws.amazon.com/redshift/)
* [Apache Cassandra](https://cassandra.apache.org/)
* [Apache Druid](https://druid.apache.org/)
* [Apache Hive](https://hive.apache.org/)
* CSV
* [dbt](https://www.getdbt.com/)
* [Delta Lake](https://delta.io/)
* [Elasticsearch](https://www.elastic.co/)
* [Google BigQuery](https://cloud.google.com/bigquery)
* [IBM DB2](https://www.ibm.com/analytics/db2)
* [PrestoDB](http://prestodb.io/)
* [Trino (formerly Presto SQL)](https://trino.io/)
* [Vertica](https://www.vertica.com/)
* [Snowflake](https://www.snowflake.com/)

