Connecting to Apache Cassandra

Apache Cassandra is a NoSQL database management system. Use the Apache Cassandra data connector to import your company’s Cassandra data.

Before you begin

To connect to Cassandra, you need to collect the following:

  • the hostname or IP address of the database server
  • the correct connection port
  • your username and password if using authentication

For help establishing the connection prerequisites, contact the Cassandra administrator at your organization. If your administrator cannot help you, you or your administrator should contact Cassandra Support.
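
Before entering the settings, it can help to confirm that the host and port you collected are reachable from your machine. The following is a minimal sketch, not part of Analytics: the function name `can_reach` and the 3-second timeout are illustrative assumptions. It only checks that a TCP connection can be opened and does not test authentication.

```python
import socket


def can_reach(host: str, port: int = 9042, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the hostname and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_reach("cassandra.example.com")` (a placeholder hostname) returns False if a firewall blocks the port, which is worth raising with your Cassandra administrator.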

Create a Cassandra connection

  1. From the Analytics main menu, select Import > Database and application.
  2. From the New Connections tab, in the ACL Connectors section, select Cassandra.
  3. In the Data Connection Settings panel, enter the connection settings, and at the bottom of the panel, click Save & Connect.

    You can accept the default Connection Name or enter a new one.

The Cassandra connection is saved in the Existing Connections tab. In the future, you can reconnect to Cassandra from the saved connection.

Once the connection is established, the Data Access window opens in the Staging Area and you can start importing data. For help importing Cassandra data, see Working with the Data Access window.

Connection Settings

Basic Settings

Setting: Host
Description: The IP address or hostname of the Cassandra server.

Setting: Port
Description: The TCP port for the Cassandra database.
Example: 9042

Setting: Default Keyspace
Description: The default keyspace (schema) to connect to in Cassandra.

Setting: Authentication Mechanism
Description: The authentication mechanism to use to connect to the Cassandra server. The available options are as follows:

  • No Authentication
  • Username and Password

Example: No Authentication

Setting: Username
Description: The username used to access the Cassandra server.

Setting: Password
Description: The password corresponding to the username provided.
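
The relationship between the basic settings can be sketched as a small consistency check. This is an illustration only: the `BasicSettings` class and `validate` function are hypothetical names, and the rule that both a username and a password are required for Username and Password authentication is an assumption drawn from the descriptions above, not documented connector behavior.

```python
from dataclasses import dataclass
from typing import List, Optional

AUTH_NONE = "No Authentication"
AUTH_USER_PASSWORD = "Username and Password"


@dataclass
class BasicSettings:
    host: str
    port: int = 9042  # example port from the settings above
    default_keyspace: Optional[str] = None
    auth_mechanism: str = AUTH_NONE
    username: Optional[str] = None
    password: Optional[str] = None


def validate(settings: BasicSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    if not settings.host.strip():
        problems.append("Host is required.")
    if not 1 <= settings.port <= 65535:
        problems.append("Port must be between 1 and 65535.")
    if settings.auth_mechanism not in (AUTH_NONE, AUTH_USER_PASSWORD):
        problems.append("Unknown authentication mechanism.")
    elif settings.auth_mechanism == AUTH_USER_PASSWORD:
        # Assumption: credentials are only needed for this mechanism.
        if not settings.username:
            problems.append("Username is required.")
        if not settings.password:
            problems.append("Password is required.")
    return problems
```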

Advanced Settings

Setting: Query Mode
Description: Specifies the query mode to use when sending queries to Cassandra. The available options are:

  • SQL: Uses SQL_QUERY_MODE and runs all queries in SQL.
  • CQL: Uses CQL_QUERY_MODE and runs all queries in CQL.
  • SQL with CQL Fallback: Uses SQL_WITH_CQL_FALLBACK_QUERY_MODE and runs all queries in SQL by default. If a query fails, the driver runs the query in CQL.

Example: SQL with CQL Fallback

Setting: Adjustable Consistency
Description: The specific Cassandra replica or the number of Cassandra replicas that must process a query for the query to succeed.
Example: A

Setting: Load Balancing Policy
Description: Specifies the load balancing policy to use.

Setting: Binary Column Length
Description: The default column length for reporting BLOB columns.
Example: 4000

Setting: String Column Length
Description: The default column length for reporting ASCII, TEXT, and VARCHAR columns.
Example: 4000

Setting: Virtual table name separator
Description: The separator for naming a virtual table created from a collection. The name of a virtual table consists of the original table name, then the separator, and then the collection name.
Example: _vt_

Setting: Enable Token Aware
Description: Specifies whether to use a token aware policy to improve load balancing and latency.

Setting: Enable latency awareness
Description: Specifies whether the driver should use a latency awareness algorithm to distribute the load away from slower performing nodes.

Setting: Enable Null Insertion
Description: Specifies whether the driver should insert all NULL values as specified in INSERT statements.

Setting: Enable case
Description: Specifies whether the driver differentiates between uppercase and lowercase letters in schema, table, and column names. If this option is enabled, all schemas, tables, and columns must be enclosed in double quotes (").

Setting: Use SQL_WVARCHAR for string data types
Description: Specifies whether to use SQL_WVARCHAR for text and string data types.
Example: varchar

Setting: Enable Paging
Description: Specifies whether to split large result sets across pages.

Setting: Rows Per Page
Description: When the Enable Paging option is enabled, use this option to specify the maximum number of rows to display on each page.
Example: 10000

Setting: SSL Options
Description: Specifies how the driver uses SSL to connect to the Cassandra server. The available options are:

  • No SSL: The driver does not use SSL.
  • One-way server verification: If the Enable Server Hostname Verification option is enabled, the client verifies the Cassandra server using SSL. Otherwise, the driver connects to the Cassandra server using SSL, but the client and server do not verify each other.
  • Two-way server and client verification: If the Enable Server Hostname Verification option is enabled, the Cassandra client and server verify each other using SSL. Otherwise, the driver connects to the Cassandra server using SSL, but the client and server do not verify each other.

Example: No SSL

Setting: Enable Server Hostname Verification
Description: Specifies whether the driver requires the server’s hostname to match the hostname in the SSL certificate.

Setting: Ssltrustedcertspath
Description: The full path to the .pem file containing the certificate to verify the server.

Setting: Client Side Certificate
Description: The full path to the .pem file containing the certificate to verify the client.

Setting: Client Side Private Key
Description: The full path to the file containing the private key used to verify the client.

Setting: Key File Password
Description: The password for the private key file that is specified in the Client Side Private Key field.
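
The three Query Mode options can be illustrated with a short sketch. The driver handles this internally, so the code below is not its implementation. The `run_sql` and `run_cql` callables are hypothetical stand-ins for "run this query as SQL" and "run this query as CQL", and treating any exception as a failed query is an assumption.

```python
from typing import Any, Callable

SQL_QUERY_MODE = "SQL"
CQL_QUERY_MODE = "CQL"
SQL_WITH_CQL_FALLBACK_QUERY_MODE = "SQL with CQL Fallback"


def run_query(query: str, mode: str,
              run_sql: Callable[[str], Any],
              run_cql: Callable[[str], Any]) -> Any:
    """Dispatch a query according to the selected query mode."""
    if mode == SQL_QUERY_MODE:
        return run_sql(query)
    if mode == CQL_QUERY_MODE:
        return run_cql(query)
    if mode == SQL_WITH_CQL_FALLBACK_QUERY_MODE:
        try:
            # SQL is tried first by default.
            return run_sql(query)
        except Exception:
            # If the SQL attempt fails, the same query is retried as CQL.
            return run_cql(query)
    raise ValueError(f"Unknown query mode: {mode!r}")
```

In fallback mode a query that fails as SQL costs two attempts, so the plain SQL or CQL mode may suit scripts that always use one language.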

Query Cassandra

One advantage of the Apache Cassandra design is the ability to store denormalized data in fewer tables. Nested data structures such as sets, lists, and maps can simplify transactions. However, Analytics does not support direct access to this type of data. The connector normalizes the data contained in collections (sets, lists, and maps) into virtual tables, so that users can interact with the data directly while Cassandra continues to store it in its denormalized form.

If a table contains collection columns, when the table is first queried, the connector creates the following virtual tables:

  • A “base” table, which contains the same data as the real table except for the collection columns.
  • A virtual table for each collection column, which expands the nested data.

The virtual tables refer to the data in the real table, which allows the connector to access the denormalized data. By querying virtual tables, you can access the contents of Cassandra collections via ODBC.

The base table and virtual tables appear as additional tables in the list of tables that exist in the database. The base table uses the same name as the actual table it represents. Virtual tables that represent collections are named using the actual table name, a separator (_vt_ by default), and the column name.
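
To make the idea of a base table plus virtual tables concrete, here is a rough sketch that splits rows holding collection values into a base table and one virtual table per collection column. It is not the connector's implementation. The virtual-table column names `index`, `map_key`, and `value`, and the choice to repeat a single key column in every virtual row, are assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def normalize(rows: List[Row], key: str) -> Tuple[List[Row], Dict[str, List[Row]]]:
    """Split rows into a base table and one virtual table per collection column."""
    base: List[Row] = []
    virtual: Dict[str, List[Row]] = {}
    for row in rows:
        base_row: Row = {}
        for column, value in row.items():
            if isinstance(value, dict):
                # Map: one virtual row per key/value pair.
                entries = [{"map_key": k, "value": v} for k, v in value.items()]
            elif isinstance(value, list):
                # List: keep each element's position.
                entries = [{"index": i, "value": v} for i, v in enumerate(value)]
            elif isinstance(value, (set, frozenset)):
                # Set: unordered, so sort for a stable result.
                entries = [{"value": v} for v in sorted(value)]
            else:
                # Ordinary column: stays in the base table.
                base_row[column] = value
                continue
            table = virtual.setdefault(column, [])
            for entry in entries:
                # Each virtual row points back to its source row through the key.
                table.append({key: row[key], **entry})
        base.append(base_row)
    return base, virtual
```

Each virtual row carries the key of the row it came from, which is one way the expanded data can refer back to the real table.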
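
The naming rule described above can be written as a small helper. The function names are hypothetical, and the table and column names in the usage note are placeholders.

```python
from typing import List


def virtual_table_name(table: str, column: str, separator: str = "_vt_") -> str:
    """Build a virtual table name: table name, separator, collection column name."""
    return f"{table}{separator}{column}"


def listed_tables(table: str, collection_columns: List[str],
                  separator: str = "_vt_") -> List[str]:
    """Return the base table name followed by one virtual table name per collection column."""
    # The base table keeps the name of the actual table it represents.
    return [table] + [virtual_table_name(table, c, separator) for c in collection_columns]
```

For a table named customers with collection columns emails and phones, `listed_tables("customers", ["emails", "phones"])` gives customers, customers_vt_emails, and customers_vt_phones. If the Virtual table name separator setting is changed, pass the same value as `separator`.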

Data Connector Updates

When you upgrade Analytics, Robots Agent, or AX Server, you should test any of your scripts that import data using one of the Analytics data connectors (ACCESSDATA command).

Changes made by third-party data sources or ODBC driver vendors may have required updates to one or more of the data connectors. Scripted data connections may need to be updated to continue to work correctly.

  • Run the import again: The easiest way to update a connection is to manually perform an import using the Data Access window in the upgraded version of Analytics. Copy the ACCESSDATA command from the log and use it to update your script.
  • Update the field specifications: You may also need to update the field specifications in the body of the script to align with changes to the table schema in the data source or ODBC driver. Possible changes include field names, field data types, and field and record lengths.
  • Verify the results of any filtering: You should also verify the results of any filtering you apply as part of the data import. Confirm that the import filtering includes and excludes records correctly.
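
When checking field specifications after an upgrade, one possible approach is to compare the field names, data types, and lengths recorded before and after. The sketch below is a generic illustration that sits outside Analytics. The `FieldSpec` type and `diff_fields` function are hypothetical, and how you obtain the two sets of specifications is left to you.

```python
from typing import Dict, List, NamedTuple


class FieldSpec(NamedTuple):
    data_type: str
    length: int


def diff_fields(old: Dict[str, FieldSpec], new: Dict[str, FieldSpec]) -> List[str]:
    """List differences in field names, data types, and lengths."""
    changes = []
    # Names present on only one side suggest a removed, added, or renamed field.
    for name in sorted(old.keys() - new.keys()):
        changes.append(f"removed or renamed: {name}")
    for name in sorted(new.keys() - old.keys()):
        changes.append(f"added or renamed: {name}")
    # Fields present on both sides are compared attribute by attribute.
    for name in sorted(old.keys() & new.keys()):
        if old[name].data_type != new[name].data_type:
            changes.append(
                f"type changed: {name} {old[name].data_type} -> {new[name].data_type}")
        if old[name].length != new[name].length:
            changes.append(
                f"length changed: {name} {old[name].length} -> {new[name].length}")
    return changes
```

An empty result means the recorded specifications match. It does not replace verifying the imported records themselves.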

Apache Cassandra Data Connector Changes

Specific changes made to the Apache Cassandra Data Connector are listed below.

Analytics version: 14.2
Change: The connector no longer supports connection to Apache Cassandra 2.0. Connections to Apache Cassandra 2.1, 2.2, and 3.0 can be made.
