DataSource
A Datasource is an entity within the Virtual Data Assistant (VDA) that serves as a container for a collection of metadata. Metadata is information about datasets, such as their location, schema, data types, and other relevant properties. In effect, a Datasource is a virtual folder that groups related datasets so that users can manage and access them more easily.
Navigate to Datasources and click Create to create a new data source.
Select the connector (source type) radio button that matches the required data source.
If you do not want all tables, add table names in the Include or Exclude boxes:
Include: Fetch only the listed table(s)
Exclude: Fetch all tables except the listed table(s)
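The Include and Exclude behaviour described above can be sketched in a few lines of Python. This is only an illustration, not VDA's implementation: `select_tables` and its parameters are hypothetical names, and applying Include before Exclude when both boxes are filled is an assumption.

```python
def select_tables(all_tables, include=None, exclude=None):
    """Return the tables to fetch, given optional Include/Exclude lists.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Ignore blank entries and surrounding whitespace in both lists.
    include = {t.strip() for t in (include or []) if t.strip()}
    exclude = {t.strip() for t in (exclude or []) if t.strip()}

    # Include: fetch only the listed tables (all tables if the list is empty).
    candidates = [t for t in all_tables if not include or t in include]

    # Exclude: drop the listed tables.
    # Assumption: when both lists are given, Exclude is applied after Include.
    return [t for t in candidates if t not in exclude]


tables = ["orders", "customers", "audit_log"]
only_orders = select_tables(tables, include=["orders"])
without_audit = select_tables(tables, exclude=["audit_log"])
```

Here `only_orders` contains just `orders`, while `without_audit` contains every table except `audit_log`.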
Connectors are modules or plugins that establish connections to specific data sources or databases. VDA offers connectors for popular data sources such as PostgreSQL and MySQL. When a user selects a connector and provides the necessary connection details, the connector establishes a link to the data source, allowing access to the data within that source.
The Postgres data source type supports the following two methods:
Detail:
Select this parameter when you are connecting with a standard username, password, and port.
URL:
Select this parameter when you have a custom URL connection string. Below is an example of a connection string:
postgresql://user:password@hostname:port/database
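As a rough sketch, a connection string in the format above can be assembled from the individual details with Python's standard library. The helper name `build_postgres_url` and the sample values are assumptions made for illustration. Percent-encoding the credentials keeps characters such as `@` or `:` in a password from breaking the URL.

```python
from urllib.parse import quote, unquote, urlsplit


def build_postgres_url(user, password, hostname, port, database):
    """Assemble postgresql://user:password@hostname:port/database.

    Hypothetical helper, not part of VDA.
    """
    # safe="" makes quote() encode every reserved character, including '/' and '@'.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{hostname}:{port}/{quote(database, safe='')}"
    )


# Sample values are placeholders.
url = build_postgres_url("vda_user", "p@ss:word", "db.example.com", 5432, "sales")

# Splitting the URL again recovers the individual details.
parts = urlsplit(url)
decoded_password = unquote(parts.password)
```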
Description: Use this method when connecting to an Oracle database using standard credentials and connection parameters.
Parameters:
Role: The role assigned to the user (e.g., DBA, Developer).
Host: The hostname or IP address of the Oracle server.
User: The username for the Oracle database.
Password: The password for the Oracle database.
Schema: The schema within the Oracle database to which the user has access.
Service: The Oracle service name (e.g., ORCL).
URL Method:
Description: Use this method when you have a custom URL connection string for the Oracle database.
jdbc:oracle:thin:@//oracle.example.com:1521/ORCL
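The example URL above follows the `jdbc:oracle:thin:@//host:port/service` shape. The sketch below builds and parses URLs of that shape. The function names are hypothetical, and the regular expression only covers this simple form, not every variant the Oracle driver accepts.

```python
import re

# Matches only the simple form jdbc:oracle:thin:@//host:port/service.
_ORACLE_THIN = re.compile(
    r"^jdbc:oracle:thin:@//(?P<host>[^:/]+):(?P<port>\d+)/(?P<service>.+)$"
)


def build_oracle_thin_url(host, port, service):
    """Hypothetical helper: compose a thin-driver URL from its parts."""
    return f"jdbc:oracle:thin:@//{host}:{port}/{service}"


def parse_oracle_thin_url(url):
    """Hypothetical helper: split a thin-driver URL into host, port, service."""
    match = _ORACLE_THIN.match(url)
    if match is None:
        raise ValueError(f"not a jdbc:oracle:thin:@//host:port/service URL: {url!r}")
    return {
        "host": match["host"],
        "port": int(match["port"]),
        "service": match["service"],
    }


details = parse_oracle_thin_url("jdbc:oracle:thin:@//oracle.example.com:1521/ORCL")
rebuilt = build_oracle_thin_url(**details)
```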
Name:
Description: The name of the Kafka connection.
Example: Kafka_Prod_Cluster
URL:
Description: The base URL for the Kafka service.
Importance: Provides the endpoint for connecting to the Kafka service.
Example: kafka.example.com
Broker URL:
Description: A comma-separated list of host and port pairs that are the addresses of the Kafka brokers.
Importance: Specifies the Kafka brokers to connect to.
Example: kafka1.example.com:9092,kafka2.example.com:9092
User:
Description: The username for the Kafka connection.
Importance: Used for authenticating the user accessing the Kafka cluster.
Example: kafka_user
Password:
Description: The password for the Kafka connection.
Importance: Secures the connection by authenticating the user.
Example: securepassword
Secure Connection:
Description: A checkbox option to enable a secure connection.
Importance: Ensures that the data transmitted between the client and Kafka brokers is encrypted.
Example: Checked or Unchecked
Included Tables:
Description: A list of specific tables/topics to be included in the data ingestion process.
Importance: Allows for targeted data ingestion, focusing on relevant tables/topics.
Example: topic1, topic2, topic3
Excluded Tables:
Description: A list of specific tables/topics to be excluded from the data ingestion process.
Importance: Prevents unnecessary or irrelevant data from being ingested.
Example: topic4, topic5
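The Broker URL field above is a comma-separated list of host and port pairs. A minimal validation sketch, assuming plain `host:port` entries (no IPv6 literals) and a hypothetical helper name `parse_broker_list`, might look like this:

```python
def parse_broker_list(broker_url):
    """Split a comma-separated broker list into (host, port) tuples.

    Hypothetical helper for illustration; VDA may validate differently.
    """
    brokers = []
    for entry in broker_url.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and stray whitespace
        # rpartition splits on the last ':' so the port is always the final piece.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        brokers.append((host, int(port)))
    return brokers


brokers = parse_broker_list("kafka1.example.com:9092,kafka2.example.com:9092")
```

Checking the list before saving the data source surfaces a missing port or a stray separator early, rather than at ingestion time.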
Detail Method
Name:
Description: The name of the Hive connection.
Importance: Identifies the specific Hive data source within the VDA.
Example: Hive_Prod_Cluster
Metastore URL:
Description: The URL of the Hive Metastore service.
Importance: Provides the endpoint for connecting to the Hive Metastore, which manages metadata for Hive tables and databases.
Example: thrift://metastore.example.com:9083
Hive URL:
Description: The URL of the Hive server.
Importance: Provides the endpoint for connecting to the Hive server for executing queries and accessing data.
Example: jdbc:hive2://hive.example.com:10000/default
Database:
Description: The name of the specific database within the Hive server to connect to.
Importance: Specifies the target database for data operations.
Example: default
Included Tables:
Description: A list of specific tables to be included in the data ingestion process.
Importance: Allows for targeted data ingestion, focusing on relevant tables.
Example: table1, table2, table3
Excluded Tables:
Description: A list of specific tables to be excluded from the data ingestion process.
Importance: Prevents unnecessary or irrelevant data from being ingested.
Example: table4, table5
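The Metastore URL and Hive URL examples above use different schemes (`thrift://` and `jdbc:hive2://`). The sketch below shows one way to sanity-check both and extract the host, port, and database. `parse_hive_urls` is a hypothetical name, and falling back to the `default` database when the URL omits one is an assumption.

```python
from urllib.parse import urlsplit


def parse_hive_urls(metastore_url, hive_url):
    """Extract endpoints from a Metastore URL and a HiveServer2 JDBC URL.

    Hypothetical helper for illustration only.
    """
    meta = urlsplit(metastore_url)
    if meta.scheme != "thrift":
        raise ValueError(f"Metastore URL should start with thrift://: {metastore_url!r}")

    if not hive_url.startswith("jdbc:hive2://"):
        raise ValueError(f"Hive URL should start with jdbc:hive2://: {hive_url!r}")
    # Drop the 'jdbc:' prefix so urlsplit sees an ordinary scheme://host:port/path URL.
    hive = urlsplit(hive_url[len("jdbc:"):])

    # Anything after ';' in the path is a session setting, not part of the database name.
    database = hive.path.lstrip("/").split(";")[0]
    return {
        "metastore": (meta.hostname, meta.port),
        "hive_server": (hive.hostname, hive.port),
        "database": database or "default",  # assumption: fall back to 'default'
    }


endpoints = parse_hive_urls(
    "thrift://metastore.example.com:9083",
    "jdbc:hive2://hive.example.com:10000/default",
)
```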
Detail Method
Name:
Description: The name of the SQL Server connection.
Importance: Identifies the specific SQL Server data source within the VDA.
Example: SQLServer_Prod
Host:
Description: The hostname or IP address of the SQL Server.
Importance: Specifies the server where the SQL Server is hosted.
Example: sqlserver.example.com
Port:
Description: The port number used to connect to the SQL Server.
Importance: Specifies the network port for the SQL Server connection.
Example: 1433
User:
Description: The username for the SQL Server database.
Importance: Used for authenticating the user accessing the SQL Server.
Example: db_user
Password:
Description: The password for the SQL Server database.
Importance: Secures the connection by authenticating the user.
Example: securepassword
Database:
Description: The name of the specific database within the SQL Server to connect to.
Importance: Specifies the target database for data operations.
Example: mydatabase
Schema:
Description: The schema within the SQL Server database.
Importance: Defines the organizational structure of tables within the database.
Example: dbo
Included Tables:
Description: A list of specific tables to be included in the data ingestion process.
Importance: Allows for targeted data ingestion, focusing on relevant tables.
Example: table1, table2, table3
Excluded Tables:
Description: A list of specific tables to be excluded from the data ingestion process.
Importance: Prevents unnecessary or irrelevant data from being ingested.
Example: table4, table5
URL Method
Description: Use this method when you have a custom URL connection string for SQL Server, incorporating all necessary connection details.
jdbc:sqlserver://sqlserver.example.com:1433;databaseName=mydatabase;user=db_user;password=securepassword;schema=dbo
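The URL above packs the Detail Method fields into `;`-separated `key=value` properties. As a sketch, it can be split back into those fields as follows. `parse_sqlserver_url` is a hypothetical name, and the parser assumes no value contains `;` (the real JDBC driver has escaping rules that this sketch ignores).

```python
def parse_sqlserver_url(url):
    """Split jdbc:sqlserver://host:port;key=value;... into a dict.

    Hypothetical helper; does not handle escaped ';' inside values.
    """
    prefix = "jdbc:sqlserver://"
    if not url.startswith(prefix):
        raise ValueError(f"not a SQL Server JDBC URL: {url!r}")

    # The first segment is host[:port]; the rest are key=value properties.
    server, *props = url[len(prefix):].split(";")
    host, _, port = server.partition(":")
    # 1433 matches the Port example in the Detail Method.
    result = {"host": host, "port": int(port) if port else 1433}

    for prop in props:
        if not prop:
            continue  # skip the empty piece left by a trailing ';'
        key, _, value = prop.partition("=")
        result[key] = value
    return result


fields = parse_sqlserver_url(
    "jdbc:sqlserver://sqlserver.example.com:1433;"
    "databaseName=mydatabase;user=db_user;password=securepassword;schema=dbo"
)
```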
Detail Method
Name:
Description: The name of the MySQL connection.
Importance: Identifies the specific MySQL data source within the VDA.
Example: MySQL_Prod
Host:
Description: The hostname or IP address of the MySQL server.
Importance: Specifies the server where the MySQL database is hosted.
Example: mysql.example.com
Port:
Description: The port number used to connect to the MySQL server.
Importance: Specifies the network port for the MySQL connection.
Example: 3306
User:
Description: The username for the MySQL database.
Importance: Used for authenticating the user accessing the MySQL database.
Example: db_user
Password:
Description: The password for the MySQL database.
Importance: Secures the connection by authenticating the user.
Example: securepassword
Database:
Description: The name of the specific database within the MySQL server to connect to.
Importance: Specifies the target database for data operations.
Example: mydatabase
Schema:
Description: The schema within the MySQL database.
Importance: Defines the organizational structure of tables within the database.
Example: public
Included Tables:
Description: A list of specific tables to be included in the data ingestion process.
Importance: Allows for targeted data ingestion, focusing on relevant tables.
Example: table1, table2, table3
Excluded Tables:
Description: A list of specific tables to be excluded from the data ingestion process.
Importance: Prevents unnecessary or irrelevant data from being ingested.
Example: table4, table5
URL Method
Description: Use this method when you have a custom URL connection string for MySQL, incorporating all necessary connection details.
jdbc:mysql://mysql.example.com:3306/mydatabase?user=db_user&password=securepassword
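The MySQL URL above carries the host, port, and database in the URL itself and the credentials in the query string. A minimal parsing sketch using only the standard library follows. `parse_mysql_url` is a hypothetical name, and defaulting to port 3306 when the URL omits one is an assumption based on the Port example above.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_mysql_url(url):
    """Split jdbc:mysql://host:port/database?user=...&password=... into fields.

    Hypothetical helper for illustration only.
    """
    if not url.startswith("jdbc:mysql://"):
        raise ValueError(f"not a MySQL JDBC URL: {url!r}")

    # Drop the 'jdbc:' prefix so urlsplit sees a regular mysql:// URL.
    parts = urlsplit(url[len("jdbc:"):])
    # parse_qsl also decodes percent-encoded values in the query string.
    params = dict(parse_qsl(parts.query))

    return {
        "host": parts.hostname,
        "port": parts.port or 3306,  # assumption: default to 3306
        "database": parts.path.lstrip("/"),
        "user": params.get("user"),
        "password": params.get("password"),
    }


fields = parse_mysql_url(
    "jdbc:mysql://mysql.example.com:3306/mydatabase?user=db_user&password=securepassword"
)
```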
List of other connectors
Amazon Glue and anything built over it
CSV