October 23, 2017
Spark and Qlik Integration
Steps:
- Start the Spark Thrift Server on the DataStax cluster
 
$ dse -u cassandra -p <password> spark-sql-thriftserver start --conf spark.cores.max=4 --conf spark.executor.memory=2G --conf spark.driver.maxResultSize=1G --conf spark.kryoserializer.buffer.max=512M --conf spark.sql.thriftServer.incrementalCollect=true
- In AWS, allow the Qlik server's security group to reach port 10000 on the Spark Thrift Server (Qlik connects to the Thrift Server on port 10000)
- Install the Simba ODBC Driver for Spark on the Qlik Windows EC2 instance
Create a System DSN as follows:
| Setting | Value |
|---|---|
| Spark Server Type | SparkThriftServer |
| Host | internal-spark-thriftserver-prod-lb-861234576.ap-southeast-1.elb.amazonaws.com (DNS name of the Spark Thrift Server ELB) |
| Port | 10000 |
| Database | avm_analytics |
| Authentication Mechanism | Username |
| Thrift Transport | SASL |
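
Before testing the DSN in Qlik, it can help to confirm that the Thrift Server port is actually reachable from the Qlik instance. The following is a minimal sketch, assuming Python is available on that machine; the helper name `port_is_open` is hypothetical and not part of any Qlik or Spark tooling. It only checks TCP reachability, not authentication.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts, and DNS resolution failures
        return False


# Example call (host and port taken from the DSN settings above):
# port_is_open(
#     "internal-spark-thriftserver-prod-lb-861234576.ap-southeast-1.elb.amazonaws.com",
#     10000,
# )
```

If this returns `False`, the security group rule or the ELB listener is the first thing to check before looking at the ODBC driver settings.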
- In the Qlik Admin UI, go to Data Connections and select the DSN created above; the connection should be established
- In the Data Editor, enter the following script to run a query
 

LIB CONNECT TO 'Simba Spark(Qlik-sense-administration)';  
select txn_id, txn_date from transactions where txn_date >= '2017-06-05' and txn_date < '2017-06-06';  
- Observe the execution of the Spark job in the Spark Web UI
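
The query above selects one day of transactions with a half-open date range (on or after the day, before the next day). As a rough sketch of how the same query could be built for any day and run outside Qlik to check the DSN, the Python below may be useful. The function names `day_range_query` and `run_query` are hypothetical, the DSN name passed to `run_query` is whatever was chosen when creating the System DSN, and `run_query` assumes the third-party `pyodbc` package is installed.

```python
from datetime import date, timedelta


def day_range_query(day: date) -> str:
    """Build the single-day query using the half-open range [day, day + 1)."""
    next_day = day + timedelta(days=1)
    return (
        "select txn_id, txn_date from transactions "
        f"where txn_date >= '{day.isoformat()}' "
        f"and txn_date < '{next_day.isoformat()}'"
    )


def run_query(dsn: str, sql: str):
    """Run sql through an ODBC DSN and return all rows (needs pyodbc)."""
    import pyodbc  # imported lazily so the query builder works without it

    # autocommit is enabled here on the assumption that the Spark ODBC
    # driver does not support transactions
    conn = pyodbc.connect(f"DSN={dsn}", autocommit=True)
    try:
        return conn.cursor().execute(sql).fetchall()
    finally:
        conn.close()
```

For example, `day_range_query(date(2017, 6, 5))` produces the same date range as the query used in the Qlik script above.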