DataSource

What is a Data Source

A data source is an entity within the Virtual Data Assistant (VDA) that serves as a container for a collection of metadata. Metadata is information about datasets, such as their location, schema, data types, and other relevant properties. In effect, a data source is a virtual folder that groups related datasets so that users can manage and access them efficiently.

How to Create a new Data Source

Navigate to Datasources and click Create to create a new data source.

Select the radio button for the connector (source type) that matches the required data source.

Advanced Properties

If you do not want to fetch all tables, add table names to the Include or Exclude box:

Include: Fetch only the listed table(s).

Exclude: Fetch all tables except the listed table(s).
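The Include and Exclude rules above can be sketched in a few lines of Python. This is an illustration only, not VDA's implementation: the function name `select_tables` is hypothetical, and the behavior when both lists are supplied (Include is applied first, then Exclude) is an assumption.

```python
def select_tables(all_tables, include=None, exclude=None):
    """Return the tables to fetch, given optional Include/Exclude lists."""
    include = set(include or [])
    exclude = set(exclude or [])
    # Include: keep only the listed tables (an empty list means "all tables").
    selected = [t for t in all_tables if not include or t in include]
    # Exclude: drop the listed tables from whatever remains.
    return [t for t in selected if t not in exclude]


tables = ["orders", "customers", "audit_log"]
print(select_tables(tables, include=["orders", "customers"]))
print(select_tables(tables, exclude=["audit_log"]))
```

Both calls select the same two tables here, which shows that Include and Exclude are two ways of expressing the same kind of filter; use whichever list is shorter.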

Connectors are modules or plugins that establish connections to specific data sources or databases. VDA offers connectors for popular data sources such as PostgreSQL and MySQL. When a user selects a connector and provides the necessary connection details, VDA establishes a link to the data source and can access the data within it.

Connector Parameters

Postgres

The Postgres data source type supports the following two methods:

  • Detail:

    Select this option when you are connecting with a standard username, password, and port.

  • URL:

    Select this option when you have a custom URL connection string. Below is an example of a connection string:

postgresql://user:password@hostname:port/database
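As a rough illustration of how the Detail fields map onto the URL form, the following Python sketch assembles a connection string and reads it back with the standard library. The helper name `build_postgres_url` and the sample values (`db.example.com`, `sales`, and the credentials) are hypothetical; the percent-encoding step is a general URL convention rather than something this page states about VDA.

```python
from urllib.parse import quote, urlsplit


def build_postgres_url(user, password, host, port, database):
    # Percent-encode credentials so characters such as '@' or '/' stay valid.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


url = build_postgres_url("db_user", "p@ss/word", "db.example.com", 5432, "sales")
parts = urlsplit(url)
print(url)
print(parts.hostname, parts.port, parts.path.lstrip("/"))
```

Reading the URL back with `urlsplit` is a quick way to confirm that the host, port, and database ended up in the expected positions before pasting the string into the URL field.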

Oracle

Detail Method:

  • Description: Use this method when connecting to an Oracle database using standard credentials and connection parameters.

  • Parameters:

    • Role: The role assigned to the user (e.g., DBA, Developer).

    • Host: The hostname or IP address of the Oracle server.

    • User: The username for the Oracle database.

    • Password: The password for the Oracle database.

    • Schema: The schema within the Oracle database to which the user has access.

    • Service: The Oracle service name (e.g., ORCL).

URL Method:

  • Description: Use this method when you have a custom URL connection string for the Oracle database.

jdbc:oracle:thin:@//oracle.example.com:1521/ORCL
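To make the structure of the example URL explicit, here is a small Python sketch that splits a thin-style service-name URL into host, port, and service. The function name `parse_oracle_thin_url` is hypothetical, and the sketch only covers the `@//host:port/service` form shown above, not other Oracle URL variants.

```python
import re

# Matches jdbc:oracle:thin:@//host[:port]/service
_THIN_URL = re.compile(
    r"^jdbc:oracle:thin:@//(?P<host>[^:/]+)(?::(?P<port>\d+))?/(?P<service>.+)$"
)


def parse_oracle_thin_url(url):
    match = _THIN_URL.match(url)
    if match is None:
        raise ValueError(f"not a thin-style service-name URL: {url!r}")
    return {
        "host": match["host"],
        # 1521 is the default Oracle listener port, used when none is given.
        "port": int(match["port"] or 1521),
        "service": match["service"],
    }


print(parse_oracle_thin_url("jdbc:oracle:thin:@//oracle.example.com:1521/ORCL"))
```

The three extracted values correspond to the Host and Service parameters of the Detail method, plus the listener port.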

Kafka

Name:

  • Description: The name of the Kafka connection.

  • Example: Kafka_Prod_Cluster

URL:

  • Description: The base URL for the Kafka service.

  • Importance: Provides the endpoint for connecting to the Kafka service.

  • Example: kafka.example.com

Broker URL:

  • Description: A comma-separated list of host and port pairs that are the addresses of the Kafka brokers.

  • Importance: Specifies the Kafka brokers to connect to.

  • Example: kafka1.example.com:9092,kafka2.example.com:9092

User:

  • Description: The username for the Kafka connection.

  • Importance: Used for authenticating the user accessing the Kafka cluster.

  • Example: kafka_user

Password:

  • Description: The password for the Kafka connection.

  • Importance: Secures the connection by authenticating the user.

  • Example: securepassword

Secure Connection:

  • Description: A checkbox option to enable a secure connection.

  • Importance: Ensures that the data transmitted between the client and Kafka brokers is encrypted.

  • Example: Checked or Unchecked

Included Tables:

  • Description: A list of specific tables/topics to be included in the data ingestion process.

  • Importance: Allows for targeted data ingestion, focusing on relevant tables/topics.

  • Example: topic1, topic2, topic3

Excluded Tables:

  • Description: A list of specific tables/topics to be excluded from the data ingestion process.

  • Importance: Prevents unnecessary or irrelevant data from being ingested.

  • Example: topic4, topic5
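The Broker URL value is a comma-separated list of host and port pairs. A minimal Python sketch for checking such a value before entering it might look like the following; the function name `parse_broker_list` is hypothetical, and the validation rules are an assumption rather than VDA's own checks.

```python
def parse_broker_list(broker_url):
    """Split a comma-separated broker string into (host, port) tuples."""
    brokers = []
    for entry in broker_url.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas or trailing whitespace
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        brokers.append((host, int(port)))
    return brokers


print(parse_broker_list("kafka1.example.com:9092,kafka2.example.com:9092"))
```

An entry without a port, such as `kafka1.example.com`, raises `ValueError`, which catches a common typo early.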

Hive

Detail Method

Name:

  • Description: The name of the Hive connection.

  • Importance: Identifies the specific Hive data source within VDA.

  • Example: Hive_Prod_Cluster

Metastore URL:

  • Description: The URL of the Hive Metastore service.

  • Importance: Provides the endpoint for connecting to the Hive Metastore, which manages metadata for Hive tables and databases.

  • Example: thrift://metastore.example.com:9083

Hive URL:

  • Description: The URL of the Hive server.

  • Importance: Provides the endpoint for connecting to the Hive server for executing queries and accessing data.

  • Example: jdbc:hive2://hive.example.com:10000/default

Database:

  • Description: The name of the specific database within the Hive server to connect to.

  • Importance: Specifies the target database for data operations.

  • Example: default

Included Tables:

  • Description: A list of specific tables to be included in the data ingestion process.

  • Importance: Allows for targeted data ingestion, focusing on relevant tables.

  • Example: table1, table2, table3

Excluded Tables:

  • Description: A list of specific tables to be excluded from the data ingestion process.

  • Importance: Prevents unnecessary or irrelevant data from being ingested.

  • Example: table4, table5
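The Metastore URL and Hive URL examples above follow the usual scheme://host:port/path layout, so they can be inspected with Python's standard library. The sketch below is illustrative: the function name `parse_hive_endpoint` is hypothetical, and it does not handle Hive URLs that carry extra `;key=value` session settings after the database name.

```python
from urllib.parse import urlsplit


def parse_hive_endpoint(url):
    # urlsplit does not understand the "jdbc:" prefix, so strip it first.
    if url.startswith("jdbc:"):
        url = url[len("jdbc:"):]
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # The path, if present, names the database (for example "default").
        "database": parts.path.lstrip("/") or None,
    }


print(parse_hive_endpoint("thrift://metastore.example.com:9083"))
print(parse_hive_endpoint("jdbc:hive2://hive.example.com:10000/default"))
```

The database extracted from the Hive URL should agree with the value entered in the Database field.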

SQL Server

Detail Method

Name:

  • Description: The name of the SQL Server connection.

  • Importance: Identifies the specific SQL Server data source within VDA.

  • Example: SQLServer_Prod

Host:

  • Description: The hostname or IP address of the SQL Server.

  • Importance: Specifies the server where the SQL Server is hosted.

  • Example: sqlserver.example.com

Port:

  • Description: The port number used to connect to the SQL Server.

  • Importance: Specifies the network port for the SQL Server connection.

  • Example: 1433

User:

  • Description: The username for the SQL Server database.

  • Importance: Used for authenticating the user accessing the SQL Server.

  • Example: db_user

Password:

  • Description: The password for the SQL Server database.

  • Importance: Secures the connection by authenticating the user.

  • Example: securepassword

Database:

  • Description: The name of the specific database within the SQL Server to connect to.

  • Importance: Specifies the target database for data operations.

  • Example: mydatabase

Schema:

  • Description: The schema within the SQL Server database.

  • Importance: Defines the organizational structure of tables within the database.

  • Example: dbo

Included Tables:

  • Description: A list of specific tables to be included in the data ingestion process.

  • Importance: Allows for targeted data ingestion, focusing on relevant tables.

  • Example: table1, table2, table3

Excluded Tables:

  • Description: A list of specific tables to be excluded from the data ingestion process.

  • Importance: Prevents unnecessary or irrelevant data from being ingested.

  • Example: table4, table5

URL Method

Description: Use this method when you have a custom URL connection string for SQL Server, incorporating all necessary connection details.

jdbc:sqlserver://sqlserver.example.com:1433;databaseName=mydatabase;user=db_user;password=securepassword;schema=dbo
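The example URL above packs the Detail fields into a server part followed by semicolon-separated `key=value` properties. The following Python sketch splits it apart so each piece can be compared with the corresponding Detail parameter; the function name `parse_sqlserver_url` is hypothetical, and the sketch assumes property values contain no semicolons.

```python
def parse_sqlserver_url(url):
    prefix = "jdbc:sqlserver://"
    if not url.startswith(prefix):
        raise ValueError(f"not a SQL Server JDBC URL: {url!r}")
    # Everything before the first ';' is host[:port]; the rest are properties.
    server, _, tail = url[len(prefix):].partition(";")
    host, _, port = server.partition(":")
    props = dict(
        item.split("=", 1) for item in tail.split(";") if "=" in item
    )
    # 1433 is the default SQL Server port, used when none is given.
    return host, int(port) if port else 1433, props


host, port, props = parse_sqlserver_url(
    "jdbc:sqlserver://sqlserver.example.com:1433;"
    "databaseName=mydatabase;user=db_user;password=securepassword;schema=dbo"
)
print(host, port)
print(props["databaseName"], props["schema"])
```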

MySQL

Detail Method

Name:

  • Description: The name of the MySQL connection.

  • Importance: Identifies the specific MySQL data source within VDA.

  • Example: MySQL_Prod

Host:

  • Description: The hostname or IP address of the MySQL server.

  • Importance: Specifies the server where the MySQL database is hosted.

  • Example: mysql.example.com

Port:

  • Description: The port number used to connect to the MySQL server.

  • Importance: Specifies the network port for the MySQL connection.

  • Example: 3306

User:

  • Description: The username for the MySQL database.

  • Importance: Used for authenticating the user accessing the MySQL database.

  • Example: db_user

Password:

  • Description: The password for the MySQL database.

  • Importance: Secures the connection by authenticating the user.

  • Example: securepassword

Database:

  • Description: The name of the specific database within the MySQL server to connect to.

  • Importance: Specifies the target database for data operations.

  • Example: mydatabase

Schema:

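  • Description: The schema within the MySQL database. In MySQL, a schema is synonymous with a database, so this value normally matches the Database field.

  • Importance: Defines the organizational structure of tables within the database.

  • Example: mydatabase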

Included Tables:

  • Description: A list of specific tables to be included in the data ingestion process.

  • Importance: Allows for targeted data ingestion, focusing on relevant tables.

  • Example: table1, table2, table3

Excluded Tables:

  • Description: A list of specific tables to be excluded from the data ingestion process.

  • Importance: Prevents unnecessary or irrelevant data from being ingested.

  • Example: table4, table5

URL Method

Description: Use this method when you have a custom URL connection string for MySQL, incorporating all necessary connection details.

jdbc:mysql://mysql.example.com:3306/mydatabase?user=db_user&password=securepassword
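The example URL above can be assembled from the same values used in the Detail method. The Python sketch below shows one way to do that; the function name `build_mysql_url` is hypothetical, and URL-encoding the query parameters is a general precaution for passwords with special characters rather than a documented VDA requirement.

```python
from urllib.parse import urlencode


def build_mysql_url(host, port, database, user, password):
    # urlencode escapes characters such as '&' or '=' inside the credentials.
    query = urlencode({"user": user, "password": password})
    return f"jdbc:mysql://{host}:{port}/{database}?{query}"


url = build_mysql_url(
    "mysql.example.com", 3306, "mydatabase", "db_user", "securepassword"
)
print(url)
```

With the sample values from the Detail method, the result is identical to the example connection string shown above.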

List of other connectors
