Using Trino
The analytics platform provides Trino as a service for data analysis. Trino is a distributed query engine that accesses data stored on object storage through ANSI SQL. It is also used for interactive queries and analysis.
Trino is integrated with enterprise authentication and authorization automation to ensure seamless access provisioning, with access ownership at the dataset level residing with the business unit that owns the data. With Trino resource management and tuning, we ensure 95% of queries complete in less than 10 seconds, allowing interactive UIs and dashboards to fetch data directly from Trino. This avoids the data duplication that can happen when creating multi-purpose data cubes.
DBeaver is a universal database administration tool for managing relational and NoSQL databases. Users can connect to Trino from DBeaver to perform SQL operations on Trino tables.

On the left-hand menu of the Platform Dashboard, select Services and then select New Services.
In the Create a new service dialogue, complete the following:
Basic Settings: Configure your service by entering the following details:
Service type: Select Trino from the list.
Service name: Enter a unique service name. This name is listed on the Services page.
Description: Enter the description of the service.
Enabled: The check box is selected by default. Selecting the option allows you to configure the Common and Custom parameters for the service.
Common Parameters: Configure the memory and CPU resources for the service. The platform uses the default system values if you do not enter any values.
Note
When setting the resource limits, consider that an insufficient limit might cause queries to fail.
Memory: Provide a minimum and maximum memory based on requirements by analyzing the cluster size, resources and available memory on nodes. Trino uses memory only within the specified limit.
CPU: Provide a minimum and maximum number of CPUs based on the requirement by analyzing cluster size, resources and availability on nodes. Trino uses CPU only within the specified limit.
Priority Class: By default, the priority is selected as Medium. You can change it to High or Low.
Running User: Specifies the logged-in user ID.
Shared: Select the checkbox to share the service with other users.
Custom Parameters: Configure the additional custom parameters for the Trino service.
Replicas: Configure the number of replicas or workers for the Trino service.
Enable Hive: Select the check box to enable Hive. Once enabled, you must enter the following:
Username: Enter the username of the platform (Lyve Cloud Compute) user creating and accessing Hive Metastore.
Container: Select big data from the list. This is the name of the container which contains Hive Metastore.
Hive Metastore path: Specify the relative path to the Hive Metastore in the configured container.
Select Create Service.
On the Services page, select the Trino services to edit.
Select ellipses and then select Edit.
On the Edit service dialog, select the Custom Parameters tab.
Expand Advanced and, in the Predefined section, select the pencil icon to edit Hive.
Specify the following in the properties file:
Property name | Description |
---|---|
hive.s3.aws-access-key | The Lyve Cloud S3 access key is a private key used to authenticate connections to a bucket created in Lyve Cloud. The access key is displayed when you create a new service account in Lyve Cloud. A service account contains bucket credentials for Lyve Cloud to access a bucket. For more information, see Creating service accounts. |
hive.s3.aws-secret-key | The Lyve Cloud S3 secret key is the private key password used to authenticate connections to a bucket created in Lyve Cloud. The secret key is displayed when you create a new service account in Lyve Cloud. For more information, see Creating service accounts. |
hive.s3.endpoint | The Lyve Cloud S3 endpoint of the bucket to connect to in Lyve Cloud. For more information, see the S3 API endpoints. |
hive.s3.ssl.enabled | Use HTTPS to communicate with the Lyve Cloud API. By default, this is set to true. |
hive.s3.path-style-access | Use path-style access for all requests to buckets created in Lyve Cloud. This is for S3-compatible storage that doesn't support virtual-hosted-style access. By default, this is set to false. |
For more information about other properties, see S3 configuration properties.
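Taken together, these properties belong in the Hive connector's catalog configuration. A minimal sketch of such a properties file is shown below; all values are placeholders (the metastore host, endpoint, and keys are illustrative, not real credentials), so substitute your own service account details:

```properties
# Hive connector catalog configuration (illustrative values only)
connector.name=hive
hive.metastore.uri=thrift://<metastore-host>:9083
hive.s3.aws-access-key=<your-access-key>
hive.s3.aws-secret-key=<your-secret-key>
hive.s3.endpoint=<lyve-cloud-s3-endpoint>
hive.s3.ssl.enabled=true
hive.s3.path-style-access=true
```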
On the Services page, select the Trino services to edit.
Select ellipses and then select Edit.
On the Edit service dialog, select the Custom Parameters tab.
Expand Advanced to edit the Configuration File for the Coordinator and Worker.
You can edit the properties file for Coordinators and Workers. Select the Coordinator and Worker tab, and select the pencil icon to edit the predefined properties file.
The following are the predefined properties file:
Log properties: You can set the log level. For more information, see Log Levels.
JVM Config: It contains the command line options to launch the Java Virtual Machine. For more information, see JVM Config.
Config Properties: You can edit the advanced configuration for the Trino server. For more information, see Config properties.
Catalog Properties: You can edit the catalog configuration for connectors, which are available in the catalog properties file. For more information, see Catalog Properties.
Assign a label to a node and configure Trino to use nodes with the same label, so that Trino runs SQL queries on the intended nodes in the cluster. Node labels are provided during Trino service configuration; you can edit these labels later.
On the Services menu, select the Trino service and select Edit.
In the Edit service dialogue, verify the Basic Settings and Common Parameters and select Next Step.
In the Node Selection section under Custom Parameters, select Create a new entry.
Specify the Key and Value of nodes, and select Save Service.
Note
The values in the image are for reference.
Complete the following prerequisite before you connect to Trino from DBeaver:
Download and Install DBeaver from https://dbeaver.io/download/.
Start DBeaver.
In the Database Navigator panel, select New Database Connection.
In the Connect to a database dialog, select All and type Trino in the search field.
Select the Trino logo and select Next.
Select the Main tab and enter the following details:
Host: Enter the hostname or IP address of your Trino cluster coordinator.
Port: Enter the port number where the Trino server listens for a connection.
Database/Schema: Enter the database/schema name to connect.
Username: Enter the username of Lyve Cloud Analytics by Iguazio console.
Password: Enter the valid password to authenticate the connection to Lyve Cloud Analytics by Iguazio.
Select Driver properties and add the following properties:
SSL: Set SSL to True.
SSL Verification: Set SSL verification to None.
Select Test Connection.
If the JDBC driver is not already installed, the Download driver files dialog opens, showing the latest available JDBC driver. Select and download the driver.
Select Finish once the testing is completed successfully.
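Equivalently, any Trino JDBC client can connect with a URL of the following shape. The host, port, and catalog are placeholders for your deployment; the SSL and SSLVerification parameters mirror the driver properties set above:

```
jdbc:trino://<coordinator-host>:<port>/<catalog>?SSL=true&SSLVerification=NONE
```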
When you create a new Trino cluster, it can be challenging to predict the number of worker nodes needed in the future. Ideally, the number of worker nodes should be sized to both ensure efficient performance and avoid excess costs. Scaling helps achieve this balance by adjusting the number of worker nodes as loads change over time.
Note
The Lyve Cloud analytics platform supports static scaling, meaning the number of worker nodes is held constant while the cluster is used.
On the left-hand menu of the Platform Dashboard, select Services.
Select the ellipses against the Trino services and select Edit.
Skip Basic Settings and Common Parameters and proceed to configure Custom Parameters.
In the Custom Parameters section, enter the Replicas and select Save Service.
Trino scaling is complete once you save the changes.
After you install Trino, the default configuration has no security features enabled. You can enable security features in different aspects of your Trino cluster.
Authentication in Trino
You can configure a preferred authentication provider, such as LDAP.
You can secure Trino access by integrating with LDAP. After completing the integration, you can establish the Trino coordinator UI and JDBC connectivity by providing LDAP user credentials.
To enable LDAP authentication for Trino, LDAP-related configuration changes need to be made on the Trino coordinator.
On the left-hand menu of the Platform Dashboard, select Services.
Select the ellipses against the Trino services and select Edit.
Skip Basic Settings and Common Parameters and proceed to configure Custom Parameters.
In the Advanced section, add the ldap.properties file for the Coordinator in the Custom section.
Configure password authentication to use LDAP in ldap.properties as shown below:
password-authenticator.name=ldap
ldap.url=ldaps://<ldap-server>:636
ldap.user-bind-pattern=<Refer below for usage>
ldap.user-base-dn=OU=Sites,DC=ad,DC=com
Property name | Description |
---|---|
ldap.url | The URL of the LDAP server. The URL scheme must be ldap:// or ldaps://. Connecting to the LDAP server without TLS enabled requires ldap.allow-insecure=true. |
ldap.user-bind-pattern | Specifies the LDAP user bind string for password authentication. This property must contain the pattern ${USER}, which is replaced by the actual username during password authentication. The property can contain multiple patterns separated by a colon; each pattern is checked in order until a login succeeds or all logins fail. For example: ${USER}@corp.example.com:${USER}@corp.example.co.uk |
ldap.user-base-dn | The base LDAP distinguished name for the user trying to connect to the server. For example: OU=America,DC=corp,DC=example,DC=com |
Authorization based on LDAP group membership
You can restrict the set of users allowed to connect to the Trino coordinator based on their group membership by setting the optional ldap.group-auth-pattern property. In addition to the basic LDAP authentication properties, add the following property to the ldap.properties file.
Property name | Description |
---|---|
ldap.group-auth-pattern | Specifies the LDAP query used for LDAP group membership authorization. This query is executed against the LDAP server and, if successful, a user distinguished name is extracted from the query result. Trino then validates the user's password by creating an LDAP context with the user distinguished name and password. |
For more information about authorization properties, see Authorization based on LDAP group membership.
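As an illustration, an ldap.group-auth-pattern entry for an Active Directory server might look like the following; the group name and organizational units here are hypothetical placeholders, not values from your environment:

```properties
# Restrict logins to members of a specific AD group (placeholder DN values)
ldap.group-auth-pattern=(&(objectClass=person)(sAMAccountName=${USER})(memberof=CN=AuthorizedGroup,OU=Asia,DC=corp,DC=example,DC=com))
```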
Add the ldap.properties file details in the config.properties file of the Coordinator using the password-authenticator.config-files=/presto/etc/ldap.properties property:
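A sketch of the relevant coordinator config.properties entries, assuming the ldap.properties path used above (http-server.authentication.type=PASSWORD is what switches the coordinator to password-based authentication):

```properties
# Enable password authentication on the coordinator and point it at ldap.properties
http-server.authentication.type=PASSWORD
password-authenticator.config-files=/presto/etc/ldap.properties
```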
Save changes to complete LDAP integration.
Note
You must configure one step at a time and always apply changes on dashboard after each change and verify the results before you proceed.
Once the Trino service is launched, create a web-based shell service to use Trino from the shell and run queries.
On the left-hand menu of the Platform Dashboard, select Services and then select New Services.
In the Create a new service dialogue, complete the following:
Basic Settings: Configure your service by entering the following details:
Service type: Select Web-based shell from the list.
Service name: Enter a unique service name. This name is listed on the Services page.
Description: Enter the description of the service.
Enabled: The check box is selected by default. Selecting the option allows you to configure the Common and Custom parameters for the service.
Common Parameters: Configure the memory and CPU resources for the service. The platform uses the default system values if you do not enter any values.
Note
When setting the resource limits, consider that an insufficient limit might cause queries to fail.
Memory: Provide a minimum and maximum memory based on requirements by analyzing the cluster size, resources and available memory on nodes. Web-based shell uses memory only within the specified limit.
CPU: Provide a minimum and maximum number of CPUs based on the requirement by analyzing cluster size, resources and availability on nodes. Web-based shell uses CPU only within the specified limit.
Priority Class: By default, the priority is selected as Medium. You can change it to High or Low.
Running User: Specifies the logged-in user ID.
Shared: Select the checkbox to share the service with other users.
Custom Parameters: Configure the additional custom parameters for the Web-based shell service.
Spark: From the drop-down, assign the Spark service for which you want a web-based shell.
Trino: From the drop-down, assign the Trino service for which you want a web-based shell.
Service Account: A Kubernetes service account which determines the permissions for using the kubectl CLI to run commands against the platform's application clusters.
Select Create Service.
Sample querying to Trino
After you create a web-based shell with the Trino service, start the service, which opens a web-based shell terminal to execute shell commands.
Select the web-based shell with the Trino service to launch the web-based shell.
The web-based shell launches in a new tab.
Enter the Trino command to run the queries and inspect catalog structures.
Create a schema with a simple query: CREATE SCHEMA hive.test_123. After the schema is created, execute SHOW CREATE SCHEMA hive.test_123 to verify the schema.
trino> create schema hive.test_123;
CREATE SCHEMA
trino> show create schema hive.test_123;
                           Create Schema
------------------------------------------------------------------
 CREATE SCHEMA hive.test_123
 AUTHORIZATION USER edpadmin
 WITH (
    location = 'v3io://projects/user/hive/warehouse/test_123.db'
 )
Create a sample table, assuming you need to create a table named employee, using the CREATE TABLE statement.
trino> CREATE TABLE IF NOT EXISTS hive.test_123.employee (eid varchar, name varchar,
    -> salary varchar, destination varchar);
CREATE TABLE
trino> SHOW CREATE TABLE hive.test_123.employee;
             Create Table
---------------------------------------
 CREATE TABLE hive.test_123.employee (
    eid varchar,
    name varchar,
    salary varchar,
    destination varchar
 )
 WITH (
    format = 'ORC'
 )
Insert sample data into the employee table with an INSERT statement.
trino> INSERT INTO hive.test_123.employee VALUES ('1201', 'Mark','45000','Technical manager');
trino> INSERT INTO hive.test_123.employee VALUES ('1202', 'Paul','45000','Technical writer');
trino> INSERT INTO hive.test_123.employee VALUES ('1203', 'Allen','40000','Hr Admin');
trino> INSERT INTO hive.test_123.employee VALUES ('1204', 'John','30000','Op Admin');
View the data in the table with a SELECT statement.
trino> select * from hive.test_123.employee;
 eid  | name  | salary |    destination
------+-------+--------+-------------------
 1203 | Allen | 40000  | Hr Admin
 1201 | Mark  | 45000  | Technical manager
 1202 | Paul  | 45000  | Technical writer
 1204 | John  | 30000  | Op Admin
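Building on the sample data above, you can run ordinary ANSI SQL such as aggregations against the table. For example, the following query counts employees and averages salaries per destination; since the salary column was created as varchar, it is cast to an integer before averaging:

```sql
-- Aggregate the sample employee data (salary is varchar, so cast it first)
SELECT destination,
       count(*) AS num_employees,
       avg(CAST(salary AS integer)) AS avg_salary
FROM hive.test_123.employee
GROUP BY destination;
```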