Category: Datastax

AWS
Cloud
Datastax
NoSQL
May 28, 2019

How to create a simple Cassandra Cluster on AWS

What is Cassandra? "Apache Cassandra is a free and open-source distributed wide column store NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure." (Wikipedia) Apache Cassandra is a high-performance, extremely scalable, fault-tolerant (i.e. no single point of failure), distributed post-relational database solution. Cassandra […]

Benchmarking
Datastax
NoSQL
March 12, 2018

Datastax Cassandra benchmark on OCI

In this blog, we show how we created a Datastax Cassandra cluster on Oracle Cloud Infrastructure (OCI) using Terraform and benchmarked Oracle Cloud bare-metal machines running Cassandra stress. 1. Steps to Create the Cluster: Download the Terraform project. Follow the guide to set up your environment. Edit the file env-vars and fill in all the relevant […]

Business Intelligence
Datastax
Spark
October 23, 2017

Spark and Qlik Integration

Steps: Start the Spark Thrift Server on the Datastax cluster. Enable the Qlik Server’s security group on AWS to access port 10000 (Qlik needs to connect to the Thrift server on port 10000). Install the Simba ODBC Driver for Spark on the Qlik Windows EC2 instance. Create a System DSN as follows: Spark Server Type: SparkThriftServer; Host: internal-spark-thriftserver-prod-lb-861234576.ap-southeast-1.elb.amazonaws.com (DNS name of […]

Datastax
Spark
October 16, 2017

Accessing Datastax Spark – Basic Examples

DSE 5.0.7, Spark 1.6.3. Accessing Spark: Accessing Spark from outside the Spark cluster. Spark SQL, Pyspark, Scala Spark Job, Spark Thrift / Beeline. Note: The difference between a Thrift query and a spark-sql query is that the Thrift server uses a shared SparkSQL context, whereas Spark SQL does not. Accessing Cassandra Materialized Views from Spark: users_by_email_mview is a materialized […]

AWS
Datastax
NoSQL
October 7, 2017

Datastax Spark on AWS

Configuration: DSE 5.0.6 (see Datastax Cassandra on AWS for installation details), so when you start dse spark or dse spark-sql, in the Spark UI you can see 3 out of 4 cores allocated. Verification: Finding the Spark master. Spark Web UI: http://<node1>:7080. Now in http://:4040/stages/, we can see 1 Fair Scheduler Pool, with schedule mode as FAIR. From AWS […]

Datastax
October 28, 2016

Cassandra Monitoring

Choice of Tool. New Relic (http://newrelic.com/plugins/3legs/113): Monitor Cassandra statistics using the 3legs plugin. Metrics include read and write latency (global and per host), cache statistics, pending compactions, flushes and more. Datastax OpsCenter (http://www.datastax.com/documentation/opscenter/4.1/pdf/opscuserguide41.pdf): DataStax OpsCenter is a visual management and monitoring solution for Apache Cassandra and DataStax Enterprise. The DataStax agents are installed on the Real-time (Cassandra), […]
